April 22 2011

Shopping for APIs

Calling itself the "first ever APIs marketplace," Mashape launched last fall to provide better access and distribution for APIs, and to help developers earn a little cash.

I asked Mashape CEO Augusto Marietti how a marketplace for APIs fits into the growing data economy and why he sees similarities between his company and Etsy. Our interview follows.

Mashape's API distribution process

What role will marketplaces have in supporting a data economy?

Augusto Marietti: The main mission of a marketplace is to facilitate the exchange of data. It's a place where you can find, distribute, and buy with the same billing mechanism. This pushes down the barriers for people to participate, and they buy and sell with more confidence and trust. There may be more freedom outside a marketplace, but the risk is higher. Is the seller trusted? Can you get your money back?

In a marketplace, you come to know the trusted seller, and there's a community that supports the buyers and the sellers. It relies on reputation. That, in part, makes you feel more confident with what you're buying.

You call yourself the "Etsy of cloud services." Why do you compare yourself to Etsy as opposed to another type of platform?

Augusto Marietti: The unique thing about Etsy is that the products available through it are handmade. We see the world of APIs in the same light.

Ten years ago APIs were only created by big corporations, and even five years ago, many web companies launched without APIs. Now small startups are born with APIs from almost day one. Even more notably, APIs aren't created just by companies, but by individual developers. These developers have a lot of projects that are dying somewhere in their folders, and now they can easily distribute and sell their little projects through the marketplace. That's the "Etsy concept."

How can an API marketplace address distribution and discovery?

Augusto Marietti: In some way what's important isn't the things that are being traded inside the marketplace but the community that works as its underlying architecture. You want a big community. You want to go where other people are, and often this means you aggregate around interests. APIs are an interest. It's better to distribute your API in a small place where 100% of the users are familiar with what you're selling.

But the buyer also wants to go to the supermarket that has everything. That puts pressure on the producer to put their products there. You can open your private shop and try by yourself, but if you don't have something special or you're selling a general purpose item — and APIs are becoming a general purpose item — you have to wonder how long you can stay in business.

Obviously, you may lose control over your product if it's in someone else's store. That's why the community part of the marketplace is important. But in our case, we're not trying to lock someone into the marketplace. With an API, it's not a "here or there" scenario.

This interview was edited and condensed.


April 20 2011

Uniform APIs for the data web

The elmcity service connects to a half-dozen other services, including Eventful, Upcoming, EventBrite, Facebook, Delicious, and Yahoo. It's nice that each of these services provides an API that enables elmcity to read their data. It would be even nicer, though, if elmcity didn't have to query, navigate, and interpret the results of each of these APIs in different ways.

For example, the elmcity service asks the same question of Eventful, Upcoming, and EventBrite: "What are the titles, dates, times, locations, and URLs of recent events within radius R of location L?" It has to ask that question three different ways, and then interpret the answers three different ways. Can we imagine a more frictionless approach?

I can. Here's how the question might be asked in a general way using the Open Data Protocol (OData):

http://odata.[eventful|upcoming|eventbrite].com/events?$filter=type eq 'recent' and radius lt 5
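As a rough sketch of what asking the question "in a general way" could look like from client code, the Python below assembles that same query URL for each service. The odata.* hostnames come from the hypothetical URL above and are not known to exist, and `build_events_query` is a helper name invented here for illustration.

```python
from urllib.parse import quote, urlencode

# Hypothetical services, taken from the example URL above; none of these
# hostnames is known to exist.
SERVICES = ["eventful", "upcoming", "eventbrite"]


def build_events_query(service, radius):
    """Return the same OData $filter query for any of the services."""
    filter_expr = f"type eq 'recent' and radius lt {radius}"
    # Percent-encode spaces but keep "$" and "'" readable.
    query = urlencode({"$filter": filter_expr}, quote_via=quote, safe="$'")
    return f"http://odata.{service}.com/events?{query}"


for name in SERVICES:
    print(build_events_query(name, 5))
```

Only the hostname changes from one service to the next; the filter expression and its encoding stay the same.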

An OData reply is a feed of Atom entries, optionally annotated with types. Here's a sketch of how one of those entries might look as part of a general OData answer to the question:

<entry>
  <title type="text">Carola Zertuche presents traditional flamenco</title>
  <author><name /></author>
  <content type="application/xml">
    <m:properties>
      <d:start m:type="Edm.DateTime">2011-05-12T12:00</d:start>
      <d:latitude m:type="Edm.Double">37.809600000000003</d:latitude>
      <d:longitude m:type="Edm.Double">-122.4106</d:longitude>
    </m:properties>
  </content>
</entry>

(With the addition of $format=json to the query URL, the same information arrives as a JSON payload.)

Of course there would still be differences among these APIs. Each of the three services in this example has its own naming conventions and its own way of modeling events and venues. It would still take some work to abstract away those differences. But you'd be using a common query mechanism, a common set of data representations, a common way of linking them together, and a common set of helper libraries for many programming environments.
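To illustrate what a "common set of data representations" buys a client, here is a minimal sketch that reads the typed properties from an entry like the one sketched earlier. It assumes the entry is wrapped in the usual Atom and OData namespaces; the sample document and the `parse_entry` helper are written for this illustration, and only two EDM types are handled.

```python
from datetime import datetime
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "d": "http://schemas.microsoft.com/ado/2007/08/dataservices",
    "m": "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata",
}

# Converters for the two EDM types used in the sketch; anything else stays text.
CONVERTERS = {
    "Edm.DateTime": datetime.fromisoformat,
    "Edm.Double": float,
}

ENTRY_XML = """\
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
       xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <title type="text">Carola Zertuche presents traditional flamenco</title>
  <content type="application/xml">
    <m:properties>
      <d:start m:type="Edm.DateTime">2011-05-12T12:00</d:start>
      <d:latitude m:type="Edm.Double">37.809600000000003</d:latitude>
      <d:longitude m:type="Edm.Double">-122.4106</d:longitude>
    </m:properties>
  </content>
</entry>
"""


def parse_entry(xml_text):
    """Turn one Atom entry into a dict, converting typed properties."""
    entry = ET.fromstring(xml_text)
    event = {"title": entry.findtext("atom:title", namespaces=NS)}
    type_attr = f"{{{NS['m']}}}type"
    for prop in entry.iterfind(".//m:properties/*", NS):
        convert = CONVERTERS.get(prop.get(type_attr), str)
        name = prop.tag.split("}", 1)[1]  # drop the namespace part
        event[name] = convert(prop.text)
    return event


event = parse_entry(ENTRY_XML)
print(event["title"], event["start"], event["latitude"], event["longitude"])
```

Because the type annotations travel with the data, the same parsing code would apply to any of the three services; only the property names would differ.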

A WordPress thought experiment

Blog publishing systems have long implemented APIs that enable client applications to fetch and post blog entries. For historical reasons there are a variety of these APIs. Because they're widely adopted in the blog domain, it's pretty likely that an application that works with one blog system's implementation of one of the APIs will work with another blog system's implementation of the same API. But these APIs are specific to the blog domain.

What if blogs had come of age in an era when a uniform kind of API was expected? We could then ask questions of blogs in the same way we could ask questions of event services in the hypothetical example shown above, or of any other kind of service. And we could interpret the answers in the same way too.

Suppose we want to ask a blog service: "What are the published entries since April 10, 2011?" Here's an OData version of the question:

$filter=post_date gt datetime'2011-04-10' and post_status eq 'publish'

And here's an answer, in JSON format, from a hypothetical WordPress OData service:

{"d" : { "results": [
  { "__metadata": {
      "uri": "",
      "type": "wordpress.wp_posts" },
    "comment_count": "0",
    "comment_status": "open",
    "guid": "",
    "ID": "7",
    "post_author": "1",
    "post_content": "OData as universal API?",
    "post_date": "\/Date(1303216978000)\/",
    "post_name": "a-wordpress-thought-experiment",
    "post_status": "publish",
    "post_title": "A WordPress thought experiment"
  }
] } }

Except it's not hypothetical! The guid shown in this example points to a real WordPress post. And the uri in the example points to a live OData service that emits the chunk of JSON we see here. If you're so inclined, you can start at the root of the service and explore all the tables used in that WordPress blog.
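A client consuming that JSON needs very little code. The sketch below works on a trimmed copy of the response (not a live call) and decodes the `\/Date(...)\/` value, which in this JSON format is a count of milliseconds since the Unix epoch. The helper names `decode_date` and `published_posts` are made up for the example.

```python
import json
import re
from datetime import datetime, timezone

# A trimmed copy of the response shown above, not fetched from a live service.
RESPONSE = r"""
{"d": {"results": [
  {"__metadata": {"uri": "", "type": "wordpress.wp_posts"},
   "ID": "7",
   "post_date": "\/Date(1303216978000)\/",
   "post_status": "publish",
   "post_title": "A WordPress thought experiment"}
]}}
"""

# After JSON decoding, the escaped "\/" becomes a plain "/".
DATE_RE = re.compile(r"/Date\((-?\d+)\)/")


def decode_date(value):
    """Convert '/Date(ms)/' (milliseconds since the Unix epoch) to UTC."""
    match = DATE_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not an OData JSON date: {value!r}")
    return datetime.fromtimestamp(int(match.group(1)) / 1000, tz=timezone.utc)


def published_posts(payload):
    """Return (title, date) pairs for the published posts in a response."""
    posts = json.loads(payload)["d"]["results"]
    return [
        (post["post_title"], decode_date(post["post_date"]))
        for post in posts
        if post["post_status"] == "publish"
    ]


for title, when in published_posts(RESPONSE):
    print(title, when.isoformat())
```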

How is this possible? I'm running WordPress on Azure; this instance of WordPress uses the SQL Azure database; the database is OData-enabled. In this case I'm allowing only read access. But if the database were writable a blog client could add new entries by sending HTTP POST requests with Atom payloads.
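For a sense of what such a write might look like, here is a hedged sketch that builds, but does not send, a POST request carrying an Atom entry. The service URL is a placeholder, the property values are invented, and a real service may require more elements in the entry than this minimal payload includes.

```python
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
D = "http://schemas.microsoft.com/ado/2007/08/dataservices"
M = "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"

# Prefixes used when the entry is serialized.
ET.register_namespace("", ATOM)
ET.register_namespace("d", D)
ET.register_namespace("m", M)


def build_new_post_request(collection_url, properties):
    """Build (but do not send) an HTTP POST that would create one entry."""
    entry = ET.Element(f"{{{ATOM}}}entry")
    content = ET.SubElement(
        entry, f"{{{ATOM}}}content", {"type": "application/xml"}
    )
    props = ET.SubElement(content, f"{{{M}}}properties")
    for name, value in properties.items():
        ET.SubElement(props, f"{{{D}}}{name}").text = str(value)
    body = ET.tostring(entry, encoding="utf-8")
    return urllib.request.Request(
        collection_url,
        data=body,
        headers={"Content-Type": "application/atom+xml"},
        method="POST",
    )


# Placeholder URL; the real service address is not given in the text.
request = build_new_post_request(
    "http://example.invalid/wordpress.svc/wp_posts",
    {"post_title": "Hello, OData", "post_status": "draft"},
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```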

OData for MySQL

Of course WordPress more typically runs on MySQL. Can we do the same kind of thing there? Sort of. Here's a query that fetches posts from a Linux/MySQL instance of WordPress and returns them as an Atom feed with OData annotations:

In this case the OData view of the underlying MySQL database is provided by MySQLOData, a "PHP-based MySQL OData Server library which exposes all data within a MySQL database to the world in OData ATOM or JSON format."

There are two issues here. One is my fault. I'm not fluent in PHP and I haven't been able to get MySQLOData working to its full capability. Do you know of a live instance of MySQLOData that is properly installed and configured? If so, please show me the URL. I'd like to try it out.

The second issue is more fundamental. Suppose MySQLOData becomes a full implementation of OData. In any environment where there is PHP and MySQL, any application built on MySQL could automatically expose an API based on a common query mechanism, a common set of data representations, a common way of linking them together, and a common set of helper libraries. Great! But what if there's no PHP in the environment? What if there's only Python? Or only Ruby? A Django- or Rails-based service shouldn't have to add PHP to the mix in order to provide a uniform API.

If MySQL itself could present an OData interface, then layered services written in any language could automatically provide APIs in a standard way. Here's a description of how that might work:

If we provide access to existing databases as though they were in hypertext form, the system will get off the ground quicker ... What is required is a gateway program which will map an existing structure onto the hypertext model, and allow limited (perhaps read-only) access to it.

If you know your web history that may sound familiar. It's from Tim Berners-Lee's 1989 proposal for the World Wide Web.
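The gateway idea in that passage can be sketched in a few lines. The example below is a toy, not an OData implementation: it uses an in-memory SQLite database as a stand-in for MySQL, the `/table` and `/table(id)` path shapes only loosely imitate OData addressing, and the table contents are invented. It maps a path onto a read-only query and returns JSON.

```python
import json
import re
import sqlite3


def make_demo_db():
    """Create a small in-memory database standing in for a blog's tables."""
    db = sqlite3.connect(":memory:")
    db.row_factory = sqlite3.Row
    db.execute(
        "CREATE TABLE wp_posts "
        "(ID INTEGER PRIMARY KEY, post_title TEXT, post_status TEXT)"
    )
    db.executemany(
        "INSERT INTO wp_posts VALUES (?, ?, ?)",
        [
            (7, "A WordPress thought experiment", "publish"),
            (8, "Draft notes", "draft"),
        ],
    )
    return db


def gateway(db, path):
    """Map '/table' or '/table(id)' to a JSON document. Read-only."""
    match = re.fullmatch(r"/(\w+)(?:\((\d+)\))?", path)
    if match is None:
        raise ValueError(f"unsupported path: {path!r}")
    table, key = match.groups()
    # Only expose tables that really exist; never interpolate unchecked names.
    known = {
        row["name"]
        for row in db.execute("SELECT name FROM sqlite_master WHERE type='table'")
    }
    if table not in known:
        raise KeyError(table)
    sql, params = f'SELECT * FROM "{table}"', ()
    if key is not None:
        sql, params = sql + " WHERE rowid = ?", (int(key),)
    rows = [dict(row) for row in db.execute(sql, params)]
    return json.dumps({"d": {"results": rows}})


db = make_demo_db()
print(gateway(db, "/wp_posts"))
print(gateway(db, "/wp_posts(7)"))
```

Because the gateway sits next to the database rather than inside the application, nothing here depends on the language the blog software itself is written in.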

There's more than one way to do it

Of course OData isn't the only way services could automatically provide uniform APIs. Such things typically come in several flavors. In the blog domain there have always been a few of them: the Blogger API, the metaWeblog API, etc. I think it's unlikely that we'll end up with a single flavor of uniform API. But right now we don't have any uniform flavor! Every service that provides an API has to invent its own query mechanism, data representations, and helper libraries. If you want to mash up services — as we increasingly do — the differences among these APIs create a lot of friction.

OData looks to me like one good way to overcome that friction. I'd love to see OData gateways co-located with every popular database. With such gateways in place, the web of data we're collectively trying to build would get off the ground quicker.


March 21 2011

A writable API competition

In conjunction with the Fluidinfo writable API for O'Reilly books and authors that was announced today, we're holding a developer competition.

[Disclosure: Tim O'Reilly is an investor in Fluidinfo.]

Unlike a normal API that provides access to read-only data, a "writable API" is shorthand for one whose underlying data is openly writable. That's the fundamental property of Fluidinfo's data model. Instead of the usual case in which a read-only API is released and programmers are encouraged to simply consume its data or contribute only in anticipated ways, the new API allows programmers to add data to the exact same objects that hold the O'Reilly data, and the new data can be anything at all. There's no need to stop to ask permission, and there's no need for anyone to have anticipated what an application writer might want to do. We're very curious to see what developers make of it.

The writable nature of Fluidinfo-based APIs opens the door to a richer world of data and applications. A single application could add new data and combine it with the existing data. A second application could further enhance and mash up the O'Reilly data and that of the first application, and so on.

To give very simple examples, an application could tag book objects to indicate that a user owns them or is reading them, could add users' current page numbers, add links to the book elsewhere, or add any other metadata it pleases. Applications can also add tags (with values) to the author objects. These could indicate things like the author's Twitter name, a link to their profile on LinkedIn, a measure of influence, a tag to show that the author is known by a user or is someone the user would like to meet, etc.
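The idea of several parties tagging the same object can be pictured with a toy in-memory model. This is not the Fluidinfo API: the `ToyStore` class, the tag names, and the rule that each user writes only under a namespace matching their username are all assumptions made for the sketch.

```python
from collections import defaultdict


class ToyStore:
    """Toy stand-in for an openly writable object store (not the Fluidinfo API)."""

    def __init__(self):
        # about value -> {"namespace/tag": value}
        self._objects = defaultdict(dict)

    def tag(self, user, about, tag_name, value=None):
        """Attach `user/tag_name` to the object identified by `about`.

        In this sketch anyone may tag any object, but only under their own
        namespace, so one application cannot overwrite another's data.
        """
        self._objects[about][f"{user}/{tag_name}"] = value

    def tags(self, about):
        """Return a copy of all tags on the object."""
        return dict(self._objects[about])


store = ToyStore()
book = "Python in a Nutshell, Second Edition"  # illustrative about value

# Data from the publisher (tag names here are made up for the sketch).
store.tag("oreilly.com", book, "title", "Python in a Nutshell")
# Two unrelated applications add their own data to the same object.
store.tag("alice", book, "owns", True)
store.tag("alice", book, "current-page", 212)
store.tag("readinglist.example", book, "rating", 4.5)

for name, value in sorted(store.tags(book).items()):
    print(name, "=", value)
```

A later application could read all four tags and combine them without the earlier writers having planned for it.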

Competition details

We imagine entries will fall into three rough classes:

  1. Uses of the basic O'Reilly book and author data, such as building a different UI to books and authors.
  2. Interesting data added to the Fluidinfo book and/or author objects. Entries in this class would not build applications.
  3. Mashups of original and new data: add to the original data, and write an application that combines both in a provocative way.


Entries will be judged by Tim O'Reilly, O'Reilly editor Mike Loukides, and O'Reilly GM Joe Wikert.


In total, three prizes will be awarded:

  • 1st prize: An OSCON package that includes a full conference pass, coach airfare from within the US, and 4 nights hotel accommodation.
  • 2nd prize: Choice of either one 3G iPad 2 64GB or one Xoom tablet 32GB (second prize includes device only, no wireless service is included).
  • 3rd prize: $500 worth of O’Reilly ebooks and/or videos; selection to be at third prize winner’s discretion.


The competition opens today (12:01 a.m. Pacific, March 21, 2011) and runs until 11:59 p.m. (Pacific) April 10, 2011. Winners will be announced on Radar on or around May 1, 2011.


Employees of O'Reilly Media and Fluidinfo are not eligible to enter the competition.

Huge apologies to our international friends, but this contest is OPEN ONLY TO DEVELOPERS WHO ARE LEGAL RESIDENTS OF THE 50 UNITED STATES AND DISTRICT OF COLUMBIA. Our legal folks worked valiantly to include everyone, but rules governing contests in each of your countries have to be followed. In the end they could not conjure up the legal magic necessary to draft rules for each of them; things are sticky enough between the boundaries within the United States. We're really sorry on this one!

How to enter and get going

Get a Fluidinfo account

You'll need to create a Fluidinfo account for your application and use its credentials to make calls to the API. If you build an application hosted on its own domain, you can use your domain name as your username in Fluidinfo.

Let the world know you're entering

As a warm-up exercise in using the Fluidinfo API, we're going to get you to tag a Fluidinfo object to indicate that you're entering the competition.

Here's what you need to do: Suppose your application's username in Fluidinfo is "" You create a tag named "" and put an instance of it onto the Fluidinfo object whose about value is "O'Reilly Fluidinfo API competition." The value of the tag should be the URL of the home page of your contest entry. If you are just adding book and/or author data to Fluidinfo, the URL you provide should describe the data you've added.

The tag therefore serves two purposes: its existence indicates that you've entered, and its value points to your entry. Note that your tag must be on the object when the contest closes in order to enter. If it is not, we won't know you have entered.

Voting for other entries

To cast a vote for an entrant, put a tag called "" onto the Fluidinfo object that corresponds to the URL of the entry. The judges will take these votes into consideration in their final decision. You can discover what other entries are out there by looking at the tags on the "O'Reilly Fluidinfo API competition" object.

The O'Reilly data in Fluidinfo

The details of the Fluidinfo objects holding the O'Reilly book and author data, and the tags on them, are described in a post on the Fluidinfo blog. On the blog you'll also find a post showing example queries against the O'Reilly API.

You can also take a look at the O'Reilly namespace in the Fluidinfo Explorer (click on the namespace in the left panel to see our top-level tags and the sub-namespaces), and can also look at individual book and author objects. For example here's the Fluidinfo object for Python in a Nutshell, Second Edition and the object for its author Alex Martelli.

Other tags on O'Reilly book data in Fluidinfo

The O'Reilly book objects also have other tags on them to give you some extra initial material to work with, and also to give you ideas. If you look at the object for Python in a Nutshell mentioned above, you'll see that as well as lots of tags, it also has tags from Amazon, Google Books, LibraryThing, and Goodreads.

A simple example Chrome extension

There's also a simple Chrome extension for O'Reilly books. This is intended to illustrate how a browser extension can pull additional information about a book from Fluidinfo and show it to the reader. If you'd like to build a browser extension, you can grab the code from Github and take it from there. If you're using Chrome, you can install the extension by following these instructions. The Fluidinfo blog has details on how to use it.

Fluidinfo object model and API

You can find out more about the Fluidinfo object model and its API on the developer's page. You can also often get help in real time by joining the #fluidinfo channel on (in fact you can even join the channel with a web-based client right from this web page).

Contest rules

Here's the fine print.



Important: Please read these Official Rules before entering the O’Reilly Writable API Contest (the "Contest") sponsored by O’Reilly Media, Inc., and Fluidinfo, Inc. (each a “Sponsor”, and collectively “Sponsors”).

1. BINDING AGREEMENT: In order to enter the Contest, you must agree to these Official Rules (“Rules”). Please read these Rules carefully; these Rules will form a legally binding agreement with respect to this Contest and you will be bound by them. You may not submit an Entry (as defined in Section 4 below) unless you agree to these Rules. You agree that participation in this Contest and/or submission of an Entry in the Contest constitutes your full and unconditional agreement to these Rules and Sponsors’ decisions, which are final and binding in all matters related to the Contest.

2. ELIGIBILITY: Contest open to all developers who are legal residents of the 50 United States and the District of Columbia, who are located in the United States or the District of Columbia at the time of entry, and who are of the age of majority in their state of residence at the time of entry. Employees, directors and officers of Sponsors, their respective subsidiaries, affiliates, distributors, retailers, agents, advertising and promotional agencies, and members of their immediate family (spouse, parents, children, sibling and their respective spouse) are not eligible to participate. Void outside of the United States and where otherwise prohibited by law. Contest is subject to all applicable federal, state and local laws and regulations.

3. CONTEST DESCRIPTION & GUIDELINES: During the Contest Period, developers have the opportunity to develop their own API (the “Application”) using the Fluidinfo API for O’Reilly books and authors developed by Fluidinfo ( (“Fluidinfo O’Reilly API”),

Unlike a normal API that provides access to read-only data, a "writable API" is shorthand for one whose underlying data is openly writable. That's the fundamental property of Fluidinfo's data model, and we are very curious to see what developers make of the FLUIDINFO O’REILLY API ( In other words, instead of the usual case in which a read-only API is released and programmers are encouraged to simply consume its data or contribute only in anticipated ways, the new API allows programmers to add additional data to the exact same objects that are holding the O'Reilly data, and the new data can be anything at all. The writable nature of Fluidinfo-based APIs opens the door to a richer world of data and applications. A single application could add new data and combine it with the existing data. A second application could further enhance and mash up the O'Reilly data and that of first application, and so on. To give very simple examples, an application could tag book objects to indicate that a user owns them or is reading them, could add users' current page numbers, add links to the book elsewhere, or add any other metadata it pleases. Applications can also add tags (with values) to the author objects. These could indicate things like a measure of influence, a tag to show that the author is known by a user or is someone the user would like to meet, etc. 
While we want you to develop the Application however you see fit, here are some suggestions on what you may want to develop, none of these are required in an entry and are independent of the judging criteria: (a) best use of the basic O'Reilly book and author data (the challenge is to take the O'Reilly book and author data and do something interesting with it - such as building a different UI to books & authors); (b) most interesting data added to the Fluidinfo book and/or author objects (what most interesting data can be added to the Fluidinfo objects that hold the book and author information -- this could be something exotic, such as information computed about authors or alternate covers for O'Reilly books), or (c) best mashup of original and new data (how does the application best add new data and present).

  • Create a Fluidinfo Account: To develop your Application, you will need to create a Fluidinfo account only if you plan to add data to the Fluidinfo O’Reilly API, in order to make those calls to the API. (Note that if you're building an application that will have its own domain, you can use your domain name as your username in Fluidinfo.) You do not need to have a Fluidinfo account or authenticate your API requests if your Application will simply be displaying O'Reilly and other public data.
  • The O'Reilly data in Fluidinfo: The details of the Fluidinfo objects holding the O'Reilly book and author data, and the tags on them, are described in a post on the Fluidinfo blog ( . You can also see some example command line queries against the API in this post on the Fluidinfo blog. You can also take a look at the O'Reilly namespace in the Fluidinfo Explorer (click on the namespace in the left panel to see our top-level tags and the sub-namespaces), and can also look at individual book and author objects.
  • A simple example Chrome extension: There's also a simple Chrome extension for O'Reilly books. This is intended to illustrate how a browser extension can pull additional information about a book from Fluidinfo and show it to the reader. If you'd like to build a browser extension, you can fork the code on Github and take it from there.
  • Fluidinfo object model and API: You can find out more about the Fluidinfo object model and its API on Fluidinfo developer's page. You can also often get help in real time by joining the #fluidinfo channel on (in fact you can even join the channel with a web based client right from this web page).

  • 4. HOW TO ENTER: Contest begins at 12:01:00 a.m. Pacific Time (“PT”) on March 21, 2011 and will end at 11:59:59 p.m. PT on April 10, 2011 (the “Contest Period”). During the Contest Period, developers may develop their own Applications using the Fluidinfo O’Reilly API. To enter the Contest, please visit the Contest landing page at (the “Website”) and follow the directions on how to enter the Contest and upload your Application on Fluidinfo. You must tag a Fluidinfo object with your application’s username with a value of the URL of the homepage of your entry. If you are only adding book and/or author data to Fluidinfo (as described in these Rules), the URL should describe the data added. The tag must be on the object by 11:59:59 p.m. PT on April 10, 2011 for your entry to be eligible for the Contest. In order to be eligible to win, you must provide all information requested in the Contest Entry Form. The Contest Entry Form, the Application and any other documentation and materials submitted in connection with the Contest will together constitute your entry and are collectively hereinafter referred to as “Entry”. Automated Entries are prohibited, and any use of automated devices will cause disqualification. Entries must be received by 11:59:59 p.m. PT on April 10, 2011 to be eligible for the Contest. You may enter as many times as you wish, but please do not submit duplicate or substantially similar Applications.

    All Entries become the property of Sponsors and none will be returned. All entrants must have a valid email address. Should multiple users of the same e-mail account enter the Contest and a dispute thereafter arise regarding the identity of the entrant, the entrant shall be deemed to be the Authorized Account Holder. “Authorized account holder” is defined as the natural person who is assigned an e-mail address by an Internet access provider, on-line service provider or other organization which is responsible for assigning e-mail addresses or the domain associated with the submitted e-mail address. Please see the privacy policy located at for details of Sponsors’ policy regarding the use of personal information collected in connection with this Contest. Entries must be in English. Each Entry must be the original work of the submitting entrant; created solely by the submitting entrant, must not have been submitted in any other competition and won previous awards; must not have been previously published or marketed; must not infringe third-party rights of any third party, including but not limited to copyright, trademark and right of privacy and publicity, and must be suitable for publication (i.e., not be obscene or indecent). If any Application contains any material or elements that are not owned by the entrant and/or which are subject to the rights of third parties, the entrant is responsible for obtaining, prior to submission, any and all releases and consents necessary to permit the licensing, use, and exhibition of the Application. By submitting an Entry in the Contest, you hereby warrant and represent that your Entry conforms to the entry requirements set forth herein. Sponsors reserve the right to waive the Contest entry requirements set forth herein in their reasonable discretion. Any waiver of any obligation hereunder by Sponsors do not constitute a general waiver of any obligation to entrants. 
Sponsors reserve the right, in their reasonable discretion, during or upon completion of the Contest Period, to request that any entrant resubmit his or her Entry which fails to comply with the Contest entry requirements prior to any judging.

    5. WINNERS SELECTION: All Entries will be judged by a qualified panel of experts who are employees of Sponsors (“Judges”). Eligible Entries shall be judged by the Judges based equally on the following criteria: (1) overall appeal; (2) overall creativity; (3) innovation of quality and features; and (4) overall usability. The entrant with the Application that receives the highest total score among all judging criteria will be the potential First Prize Winner, subject to verification. The next entrant with the Application with the next highest score will be the Second Prize Winner subject to verification, and the next entrant with the Application with the next highest score will be the Third Prize Winner subject to verification. In the event of a tie, tie breaker will be based upon the highest score in the first judging criteria, continuing thereafter to each judging criteria in order, as needed, to break the tie.
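For readers who want to see how the ranking and tie-break described in Section 5 would play out, here is an informal illustration. It is not part of the Rules: the entry names and scores are made up, and the assumption is that each criterion is scored on the same numeric scale.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    name: str
    # Scores in rule order: appeal, creativity, innovation, usability.
    scores: tuple


def rank(entries):
    """Order entries by total score, breaking ties criterion by criterion."""
    return sorted(
        entries, key=lambda e: (sum(e.scores), *e.scores), reverse=True
    )


# Made-up entries and scores, purely for illustration.
entries = [
    Entry("book-ui", (8, 7, 9, 6)),       # total 30
    Entry("author-graph", (9, 7, 8, 6)),  # total 30, higher first criterion
    Entry("cover-art", (7, 9, 6, 7)),     # total 29
]

for place, entry in enumerate(rank(entries)[:3], start=1):
    print(place, entry.name, sum(entry.scores))
```

The two entries tied at 30 are separated by the first criterion, so "author-graph" places ahead of "book-ui".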

    6. WINNER NOTIFICATION: Potential winners will be notified by email on or about May 1, 2011. Potential winners are subject to verification, including verification of age. Sponsors are not responsible for any change of entrant’s email address. Any prize or prize notification returned as undeliverable or otherwise not claimed within seven (7) days after notification of prize award will be forfeited and awarded to an alternate winner. Each Winner may be required to execute and return an affidavit of eligibility and publicity, liability and other release within seven (7) days of notification attempt or prize will be forfeited and an alternate Winner will be selected. If a potential winner is found not to be eligible or not in compliance with these Official Rules, the potential winner will be disqualified. In the event that a potential winner is disqualified for any reason, Sponsors reserve the right to award the prize to an alternate Entrant even if the disqualified potential winner’s name may have been announced.

    7. PRIZES: One (1) First Prize Winner, One (1) Second Prize Winner and One (1) Third Prize Winner will each receive the following:

    First Prize Winner (1): First prize winner will receive a trip to the O’Reilly 2011 OSCON Conference to be held on July 25-29, 2011 in Portland, Oregon. Prize includes round-trip, coach class air transportation for First Prize Winner from a major commercial airport near First Prize Winner’s home within the U.S. to Portland International Airport in Oregon; one (1) double occupancy standard hotel room for four (4) nights; one (1) Full Conference Pass. Approximate retail value (“ARV”): $3,500. Actual value of trip may vary based on point of departure and airfare fluctuations. Any difference between stated approximate retail value and actual value of First Prize will not be awarded. Selection of airline and hotel are solely within Sponsor’s discretion. Meals, gratuities, luggage fees, incidental hotel charges and any other travel-related expenses not specified herein are the sole responsibility of First Prize Winner. All travel must be taken on dates specified or First Prize will be forfeited and may be awarded to an alternate winner; no alternative travel dates are available. Exact travel dates and arrangements subject to availability. First Prize Winner must have all necessary identification and/or travel documents (e.g., a valid U.S. driver’s license) required for travel. Airline tickets are non-refundable/non-transferable and are not valid for upgrades and/or frequent flyer miles. All airline tickets are subject to flight variation, work stoppages, and schedule or route changes. If in the judgment of Sponsor, air travel is not required due to winner's proximity to Portland, Oregon, ground transportation will be substituted for roundtrip air travel at Sponsor's sole and absolute discretion. The difference in value will not be awarded to the First Prize winner. 
Sponsor shall not be responsible for any cancellations, delays, diversions or substitution or any act or omissions whatsoever by the air carriers, hotels, venue operators, transportation companies, prize providers or any other persons providing any First Prize-related services or accommodations. Additional prize award details and travel information to be provided to the First Prize winner at the time of notification. First Prize winner is also responsible for obtaining travel insurance (and all other forms of insurance) at his/her option and hereby acknowledges that Sponsor has not and will not obtain or provide travel insurance or any other form of insurance. Lost, stolen or damaged airline tickets, travel vouchers or certificates will not be replaced or exchanged.

    Second Prize Winner (1) will receive a choice of either one 3G iPad 2 64GB or one Xoom tablet 32GB (Second Prize includes device only, no wireless service is included). ARV of Second Prize is $800.

    Third Prize Winner (1) will receive $500.00 worth of O’Reilly ebooks and/or videos, selection to be at Third Prize Winner’s discretion. ARV of prize is $500.

    Total ARV of all prizes: US $4,800. All prizes amounts are in US dollars. ARV is as of the date of printing of these Rules. The difference in the value of the prize as stated herein and value at time of prize notification, if any, will not be awarded. Prizes are not transferable. No cash redemptions. No substitutions or exchanges of the prizes will be permitted, except that Sponsors reserve the right to substitute a prize of equal or greater value for any prize that becomes unavailable for any reason. The prizes are awarded "as is" and without warranty of any kind, express or implied (including, without limitation, any implied warranty of merchantability or fitness for a particular purpose). Acceptance or use of Prizes is at Winners’ own risk. All federal, state, and local taxes are the responsibility of the winner. Limit one (1) prize per person/household. Prize winners may be issued an IRS 1099 form.

    8. GENERAL CONDITIONS FOR PARTICIPATION: By submitting an Entry, each Entrant warrants that (i) the Entry does not violate any law or regulation or any right of any third party, including those laws, regulations, and rights related to copyrights, trademarks, publicity, or privacy, (ii) the Entrant has followed the Rules and has the right to grant the rights to the Entry as provided in these Rules, (iii) the Entry has not been published or submitted in any other competition; (iv) the Entry is his or her original work; (v) the Entry has not won previous awards; and (vi) publication of the Entry via various media including Web posting, will not infringe on the rights of any third party, including without limitation, third party rights in intellectual property, publicity or privacy rights. Any such entrant will indemnify and hold harmless, Sponsors and Released Parties (as defined below) from any claims to the contrary. Entry must comply with these Rules and any Terms of Service on the Website and cannot: be sexually explicit or suggestive, unnecessarily violent or derogatory of any ethnic, racial, gender, religious, professional or age group, profane or pornographic, contain nudity or any materially dangerous activity; promote alcohol, illegal drugs, tobacco, firearms/weapons (or the use of any of the foregoing), any activities that may appear unsafe or dangerous, or any particular political agenda or message; cannot be obscene or offensive, endorse any form of hate or hate group; defame, misrepresent or contain disparaging remarks about Sponsor or its products, or other people, products or companies; contain trademarks, logos or trade dress owned by others, or advertise or promote any brand or product of any kind, without permission, or contain any personal identification, such as license plate numbers, personal names, e-mail addresses or street addresses; contain copyrighted materials owned by others (including photographs, paintings and other works of art or images published on or in websites, television, movies or other media) without permission; contain background artwork unless it is an original work of the entrant, any artwork, murals, etc. that can be seen in Entries must be created solely by the entrant or entrant must be the sole owner of all copyright interests therein; contain materials embodying the names, likenesses, photographs, or other indicia identifying any person, living or dead, without permission; communicate messages or images inconsistent with the positive images and/or goodwill to which Sponsor wishes to associate; and cannot depict, and cannot itself, be in violation of any law. Sponsors do not permit the infringement of others’ rights and any use of materials not original to the entrant (except copyrighted materials owned by Sponsors) is grounds for disqualification from the Contest. Do not copy your favorite movie, book or photo or include materials, images, graphics, music or trademarks belonging to any third parties or incorporate the names, voices, likeness or personas of any party other than yourself unless you have obtained all rights necessary to permit you to use same in connection with your Entry and grant the rights herein granted to Sponsor. By submitting an Entry, entrant acknowledges that his/her Entry may be posted on Sponsors’ websites, in Sponsors’ discretion. Further, by submitting an Entry, the Entrant grants permission for Sponsors and their respective licensees and assigns to publish, post, edit, adapt, display, and otherwise use the Entry in any form, in any manner, in perpetuity, and in any and all media, without compensation of any kind to Entrant. Entrants acknowledge that Sponsors have no obligation to use or post any Entry you submit.
By submitting an Entry, you agree that your submission is gratuitous and made without restriction and will not place Sponsors under any obligation, and that Sponsors are free to disclose or otherwise disclose the ideas contained in the Entry on a non-confidential basis to anyone or otherwise use the ideas without any additional compensation to you. You acknowledge that, by accepting your Entry, Sponsors do not waive any rights to use similar or related ideas previously known to Sponsors, or developed by their employees, or obtained from sources other than you. Except where prohibited by law, by submitting an Entry into the Contest, you authorize Sponsors and their agents, to use your name, likeness, Entry and all Entry submission materials, and/or prize information, in any and all media without territorial or time limitation, for any advertising, promotional, or any other purpose without further compensation to, or permission from, you. If you think that any Entry infringes your intellectual property rights, click if you wish to report it.

    9. LIMITATION OF LIABILITY: Sponsors and any of their respective parent companies, subsidiaries, affiliates, directors, officers, professional advisors, employees, and agencies (collectively, the “Released Parties”) will not be responsible for (1) any late, lost, or misrouted entries or errors in transmission; (2) any disruptions to Internet connection, injuries, losses, or damages caused by events beyond the control of Sponsors; or (3) any printing or typographical errors in any materials associated with the Contest. The Released Parties are not responsible for technical, hardware, software, or telephone malfunctions of any kind and shall not be liable for failed, incorrect, incomplete, inaccurate, garbled, or delayed electronic communications utilized in this Contest which may limit the ability to participate in the Contest. If for any reason, including infection by computer virus, bugs, tampering, unauthorized intervention, fraud, technical failures, or any other cause beyond the control of Sponsors, which corrupts or affects the administration, security, fairness, integrity, or proper conduct of this Contest, the Contest is not capable of being conducted as described in these rules, Sponsors shall have the right, at their sole discretion, to modify and/or cancel the Contest and determine winners from Entries already received or as otherwise deemed fair and equitable by Sponsors. 
By entering the Contest, submitting an Entry and/or accepting a prize, you release the Released Parties from any liability whatsoever, and waive any and all causes of action, for any claims, costs, injuries, losses, or damages of any kind arising out of or in connection with the Contest or acceptance, possession, use and/or misuse of any prize (including, without limitation, claims, costs, injuries, losses, and damages related to personal injuries, death, damage to or destruction of property, rights of publicity or privacy, defamation or portrayal in a false light, whether intentional or unintentional) whether under a theory of contract, warranty, tort (including negligence, whether active, passive, or imputed), strict liability, product liability, contribution, or any other theory. As a condition of entering, entrants agree (and agree to confirm in writing): (a) under no circumstances will entrant be permitted to obtain awards for, and entrant hereby waives all rights to claim, punitive, incidental, consequential, or any other damages, other than for actual out-of-pocket expenses; (b) all causes of action arising out of or connected with this Contest, or any prize awarded, shall be resolved individually, without resort to any form of class action; and (c) any and all claims, judgments, and awards shall be limited to actual out-of-pocket costs incurred, excluding attorneys’ fees and court costs.


    10. GOVERNING LAW: The Contest will be governed, construed, and interpreted under the laws of the United States. Participants who violate these Rules, tamper with the operation of the Contest, or engage in any conduct that is detrimental to Sponsors, the Contest, or any other participant (as determined in Sponsors’ sole discretion) are subject to disqualification. By entering the Contest, you agree that all issues and questions concerning the construction, validity, interpretation and enforceability of these Rules, your rights and obligations, or the rights and obligations of the Sponsors in connection with the Contest, shall be governed by, and construed in accordance with, the laws of the State of California, without giving effect to any choice of law or conflict of law rules (whether of the State of California or any other jurisdiction), which would cause the application of the laws of any jurisdiction other than the State of California. Participants further consent to the jurisdiction and venue of the federal, state and local courts located in San Francisco, California.

    11. WINNER'S LIST: A list of Winners will be posted at

    12. SPONSORS’ ADDRESS: O’Reilly Media, Inc., 1005 Gravenstein Hwy N., Sebastopol, CA 95472.

    Fluidinfo, Inc., 416 West 13th Street, New York, New York, 10014


    A writable API for O'Reilly

    Today we're announcing that Fluidinfo has created a writable API for O'Reilly books and authors. We're also launching a related API contest.

    [Disclosure: Tim O'Reilly is an investor in Fluidinfo.]

    We've added information to Fluidinfo for about 2,300 O'Reilly books (or books they have rights to), and about 2,000 authors. The objects in Fluidinfo are tagged with O'Reilly information, using the domain in their tag names.

    For any O'Reilly book you can use the Fluidinfo API (description, details) to obtain any of 30 tags containing information you'd expect from a regular book API: author name(s), title, price, page count, homepage, cover image, etc. For each author there are tags with details of name, works, author page on the O'Reilly site, photo, areas of expertise, etc.

    The Fluidinfo query language lets developers obtain book and author information using different combinations of these tags.

    Beyond read-only APIs

    O'Reilly already has an API (based on RDF), so why would we bother making another one?

    One answer is that with Fluidinfo it's simple to make APIs, so we did it because it's a nice example and it was easy. Another is that accessing a book via the RDF API pulls back all its metadata, so that API is not fine grained. By splitting the metadata into tags on Fluidinfo objects, it becomes easier for programs to do queries based on individual tags or to obtain particular pieces of information about books or authors.

    But there's a much more important reason. Because Fluidinfo objects don't have owners (the tags on them do, however), anyone can add information to the book and author objects that hold the O'Reilly information.

    To illustrate, we added information from Amazon, Google Books, LibraryThing, and Goodreads onto the exact same objects that hold the O'Reilly information. That means you can trivially query across data from different sources. Plus, when you look at a Fluidinfo object, such as the one for "Programming Python", you'll see information from all these places, with tags that contain corresponding domain names.

    Fluidinfo tags
    This screenshot shows some of the tags associated with the O'Reilly book "Programming Python." Click here for the full view.

    Having tags from these well-known book sites is not the end of the story. Regular people get to have a voice, too. For example, the Fluidinfo object shown above includes tags for several people who have marked "Programming Python" as a book they own.

    In Fluidinfo, objects can always be added to by anyone or any application. If you're a developer you can sign up for a Fluidinfo account and easily write applications that not only fetch information but also augment the tags on the book and author objects. You don't have to stop to ask permission to do something creative, and you don't have to put your data elsewhere as you'd need to do if the O'Reilly information was only available via a read-only API. There are plenty of open source client-side libraries to help you.
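    The ownership model described here (objects have no owners, tags do) can be sketched with a small in-memory model. This is only an illustration, not the Fluidinfo API or one of its client libraries: the `TagStore` class, the `publisher.example` and `reviews.example` namespaces, the user `alice`, and all tag values are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SharedObject:
    """An ownerless object named by an 'about' string; only its tags have owners."""
    about: str
    tags: dict = field(default_factory=dict)  # "namespace/name" -> value


class TagStore:
    """Toy in-memory stand-in for an openly writable store (hypothetical)."""

    def __init__(self):
        self.objects = {}

    def tag(self, user, about, name, value):
        # Anyone may tag any object, but only inside their own namespace.
        namespace = name.split("/", 1)[0]
        if namespace != user:
            raise PermissionError(f"{user} cannot write to {namespace}")
        obj = self.objects.setdefault(about, SharedObject(about))
        obj.tags[name] = value

    def get(self, about, *names):
        # Fine-grained read: return only the tags that were asked for.
        obj = self.objects[about]
        return {n: obj.tags[n] for n in names if n in obj.tags}

    def query(self, predicate):
        # Names of all objects whose tags (from any source) satisfy the predicate.
        return sorted(o.about for o in self.objects.values() if predicate(o.tags))


store = TagStore()
python_book = "book:programming python"
other_book = "book:some other title"

# Placeholder values, invented for illustration only.
store.tag("publisher.example", python_book, "publisher.example/title", "Programming Python")
store.tag("reviews.example", python_book, "reviews.example/rating", 4.5)
store.tag("alice", python_book, "alice/owns", True)
store.tag("publisher.example", other_book, "publisher.example/title", "Some Other Title")
store.tag("reviews.example", other_book, "reviews.example/rating", 3.0)

# One query spans tags written by three independent parties.
owned_and_well_rated = store.query(
    lambda t: t.get("alice/owns") and t.get("reviews.example/rating", 0) >= 4
)
title_only = store.get(python_book, "publisher.example/title")
```

    Because every party writes to the same object, the final query needs no join across separate databases. The namespace check in `tag` is what stands in for tag ownership: `alice` can add `alice/owns` to any book without asking, but cannot overwrite the publisher's title.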

    Because the O'Reilly API provides access to openly writable data, we describe it as a writable API. It's an example of how we're trying to make the world more writable.

    To explain the details of the API, we've written two companion posts. The first explains the structure of the O'Reilly data in Fluidinfo, i.e., the book and author tags you'll find on objects in Fluidinfo. The second shows example O'Reilly API queries to give you a flavor for the ways you can access the data.

    We hope you'll find this as exciting and full of potential as we do, and that you'll join us in collectively marking up the world.


  • 3 ways APIs can benefit publishers
  • The future of publishing is writable

  • 3 ways APIs can benefit publishers

    O'Reilly just announced a new API and an associated contest. The technical details and the competition are spelled out in separate posts — and we hope you'll check those out — but the bigger ideas behind publishing-centric APIs deserve mention as well. That's the point of this piece. Below we look at three areas that influenced our own thinking on APIs.

    Benefit 1: Address the discovery problem

    Whether you're discussing ebooks, music, movies, television shows or even run-of-the-mill blog posts, the "discovery problem" is the tie that binds digital content. All that great stuff isn't worth much if people can't find it.

    The companies that can cut through this Gordian Knot will do very well for themselves. That's well documented. But the lure of riches and Google-esque clout have still not resulted in a truly useful discovery engine for digital material. Heck, even an undeniable financial success like Apple's App Store is plagued by this conundrum.

    An API can help. It won't fix the discovery problem, but disseminating content in a structured and open manner is an important first step.

    Locking material within a website, an ebook, or any other digital container means potential customers must be exposed to that specific website or ebook or container. And what are the chances of that? There's this odd notion that customers will visit a website to dig through an arbitrary taxonomy in search of your brilliant content. Yet, if you think of your own browsing habits, you'll realize that scenario almost never plays out. Categories are for content management systems, not customers.

    Releasing an API allows content to flow naturally through the tributaries of the Internet. It works with user behavior, which increases the potential for exposure. If the dots connect — and your content is useful/brilliant/entertaining — exposure leads to attention and attention leads to conversion.

    Benefit 2: A license to experiment

    We're not at the point where access to an API is assumed. That will change in time, but for now the technical obscurity of an API is actually an asset. An API is permission to experiment.

    For publishers, an API offers a way to run live tests of new models: a metadata-only API, a full-text API, perhaps even an API with promotional or advertising hooks. You can use the API as the basis for a hack day challenge (developers and editors, together at last). Or, you could pull a Google and set aside time for technically-minded employees to construct their own applications around the APIs.

    The opportunities are considerable, but it's important to realize that how you experiment is secondary to the action of conducting the experiments. An important shift occurs when you can explore without preconceived notions. An API, because it's so different from what publishers typically produce, creates an environment where those old notions don't apply.

    Benefit 3: Intentional unintentional consequences

    When Google released its Maps tool — and later, the Maps APIs — could anyone have anticipated the vibrant ecosystem it catalyzed? Or look at Twitter: API access gave birth to thousands of third-party apps, which transformed Twitter into a dominant communication platform. APIs encourage unintentional consequences like these. They're fuel for creativity.

    It's easy for publishers to self-select out of API development because a publisher has little in common with Google or Twitter. But that's the wrong way to look at it. An API isn't about mimicking big Internet companies. Rather, it's a way to inject new ideas into an organization without reorganizing the company or launching an entirely new business.

    The people who build products around an API see your content in ways that would never occur to you. These folks aren't concerned with your tradition or your goals. They care about their own ideas and their own products.

    In a serendipitous twist, that's precisely the type of thinking publishers need. Old techniques don't work anymore and the days of bolting heavy traditional methods onto agile digital frames are drawing to a close. An API is an elegant and mutually beneficial solution that taps into the boundary-free perspectives needed during this period of reinvention.

    As Joe Wikert aptly put it:

    I can't help but think that all these years we've been selling plastic children's toys like skyscrapers and cars, and the pieces were all glued together so customers could only use them the way we intended them to be used. Now we've decided to break the pieces into their component parts and let customers build whatever they want. It's like LEGOS for publishing!

    Your thoughts?

    These are just a few of the bigger implications of APIs. If you see other applications and opportunities, please weigh in through the comments.


    March 16 2011

    Developer Week in Review

    In a departure from the normally frivolous tone of this intro paragraph, I'd like to extend best wishes to the Japanese people as they deal with their numerous crises. Here's hoping things start to go better soon.

    The planet does continue to turn (if on a slightly different axis than a week ago), so here's what happened in the developer world this week.

    Mobile insecurity

    For years, Apple aficionados have taken pot shots at the (in)security of the Windows platform. But the recent Pwn2Own contest shows that Apple has some problems of its own. During the contest, white-hats were able to break security on both an iPhone and a BlackBerry by exploiting a common vulnerability in WebKit.

    No one attempted to break into a Windows Phone 7 or Android phone. That doesn't necessarily mean they're more secure; they just weren't targeted this time around. In any event, it highlights the increasing frequency with which mobile devices are being targeted for attacks.

    Twitter gives third-party developers the bird

    Most companies go out of their way to court third-party applications that want to integrate into their product. Google certainly has a ton of public APIs, as does Amazon, etc. But Twitter has decided that they should be the keeper of the one true Twitter client vision, and has shut the door on any new clients that wish to use the Twitter content.

    Mashups are one of the most powerful features of the Web 2.0 biome, and Twitter has always been a juicy source of (sometimes geotagged) real-time data about what is happening in the world. By closing off their data stream from anyone but their own developers, Twitter is both doing a disservice to the world, and potentially shooting themselves in the foot by preventing compelling new applications from being developed.

    Web 2.0 Expo San Francisco 2011, being held March 28-31, will examine key pieces of the digital economy and the ways you can use important ideas for your own success.

    Save 20% on registration with the code WEBSF11RAD

    About that self-healing part ...

    When the ARPAnet (the Internet's ancestor) was designed, one of the main features was that it could automatically route around damaged segments (handy for a military network in times of war). However, the recent earthquake has shown just how far we've strayed from that vision. Because of static routings hardwired into the network due to business arrangements between network providers, Internet service to and from Asia took a major hit when some key trans-Pacific cables were damaged.

    Rather than automatically routing around the problem, engineers had to step in and set up new manual routings to try and alleviate the disconnect. It may have only taken a few hours or a day, but in an increasingly Internet-dependent world, even a few hours can be devastating, especially in an emergency when communications are at a premium.

    Got news?

    If you have some juicy news, please send tips or leads here.


    March 04 2011

    Four short links: 4 March 2011

    1. JSARToolKit -- JavaScript port of the Flash AR Toolkit. I'm intrigued because the iPad 2 has a rear-facing camera and gyroscopes up the wazoo, and (of course) no Flash. (via Mike Shaver on Twitter)
    2. Android Patterns -- set of design patterns for Android apps. (via Josh Clark on Twitter)
    3. Preview of Up and Running with Node.js (O'Reilly) -- Tom Hughes-Croucher's new book in preview form. Just sorting out commenting now. (via Tom on Twitter)
    4. #Blue Opens for Business -- a web app that gets your text messages. You can reply, and there's an API to give other apps read/write access. Signs the text message is finally becoming a consumer platform.

    December 17 2010

    The future of publishing is writeable

    Publication of information obviously includes traditional media, such as books, newspapers, magazines, music, and video. But we can generalize considerably to include blogs, tagging (e.g., Delicious, Flickr), commenting systems, Twitter, Facebook, and Myspace.

    From a biological point of view, publishing can expand to encompass all of human social signaling -- both verbal and non-verbal -- and include the myriad little acts of information production and consumption we all engage in.

    Even seen from this outer limit of generality, it's clear that digital is ushering in a rapid convergence in publishing. While some forms are born digital and online, others are being reinvented there as technological advance sets old media free. There is massive disruption -- both behind and ahead of us -- as the convergence continues.

    Three convergence trends: smaller, easier, more personal

    There are three convergence trends in publishing that are already apparent.

    One clear long-term trend is that smaller pieces of information are being published. Considering just modern digital forms of publishing, there is a roughly chronological progression toward smaller publications: emails, Usenet postings, web pages, blog posts, blog comments, tweets, tags.

    Traditional media are also being fractured into smaller pieces, particularly where the media packaging existed only to address physical quirks of the media or the act of publishing. To give one example: Popular music publishing centered on delivering albums. This was a by-product of physical equipment -- LPs, CDs, and their players -- which did not align particularly well with the more natural unit of popular musical output, the song. Given low-cost flexible alternatives, it's no wonder that these forms of content are now jumping the packaging ship and going directly digital in pieces that make more sense. This leaves traditional publishers scratching their heads and clinging to increasingly irrelevant and anachronistic packaging methodologies -- newspapers being another example -- with attendant declining advertising possibilities. Clay Shirky has written and spoken with insight and eloquence on these changes (see here and here).

    A second trend is a reduction in friction. As access to easy-to-use and inexpensive publishing technology increases, it becomes economically feasible to publish smaller and less valuable pieces of content. We have reached the point where anyone with access to the Internet can easily and cheaply publish trivial, tiny pieces of information -- even single words.

    The third trend is the rise of publishing personal information. Our inescapable sociability is driving us to shape the Internet into a mechanism for publishing information about ourselves.

    These three trends -- smaller, easier, more personal -- provide a framework to examine the development of online information publishing.

    The three trends and the future of books

    Over the last few months, interesting discussion has arisen about the future of books and publishing. One provocative example is Hugh McGuire's post "The line between book and Internet will disappear." Let's consider what the trends of smaller, easier, and more personal might tell us about Hugh's topic: the future of books.

    First, these trends reinforce Hugh's claim that the line between book and Internet will disappear. The forces of convergence in publishing are surely strong enough to drag the book across that line. But more specifically, which of these trends will books succumb to? Which will books resist?

    Books typically have an internal coherence that may prevent their traditional packaging from fracturing along more natural fault lines the way it does with newspapers, magazines and albums. But as the difficulties and costs of publishing continue to fall, and as methods for online billing evolve, publishers or authors may themselves opt to fracture book packaging for economic reasons. It was not long ago that novels were routinely published in serialized form. If it's all digital, why not?

    Because modern forms of publishing are giving end users a voice, it seems a safe bet that books will become living digital objects and that the traditional distinctions between author and reader, and between publisher and consumer, will blur considerably.

    Conceptually, though perhaps not technologically, there's a long way to go. Even the most avant-garde online services are only now contemplating this kind of future. I'm willing to bet that Hugh is also right that publishers' products will have APIs. The API, provided that it allows users and applications to write, can be the vehicle by which a book is alive on the Internet, in the sense that it will allow the contribution of information to books, and make that information actionable.

    Terry Jones will discuss the writeable future of publishing at the next Tools of Change for Publishing Conference (Feb. 14-16, 2011). Save 15% on registration with the code TOC11RAD.

    A world of writeable containers

    Looking at publishing from the broad perspective outlined above, with its clear general convergence and specific trends, I consider it inevitable that books and their publishers will be drawn into a digital future along the lines that Hugh predicts.

    You can look at this more widely, though. Publishing will converge on the usage of underlying information storage that provides for a world of openly writeable containers. You could, for example, build a Twitter-like system on such a basis, providing seamlessly for user annotations. At the other end of the spectrum, you could use this type of writeable system to publish customizable living digital objects -- writeable containers -- representing books (or anything else). VC Fred Wilson lends weight to the claim of convergence toward a more openly writeable world in his blog post, "Giving every person a voice":

    If I look back at my core investment thesis over the past five years, it is this single idea, that everyone has a voice on the Internet, that is central to it. And as Ev [Williams] said, society has not fully realized what this means. But it's getting there, quickly.

    As Brian O'Leary noted in "Context first", mental models and mindset changes are required. Shifting people from read-only thinking to imagining a computational world that is by-default writeable is something I've been trying to pull off for years. (FluidDB, a database we're building at Fluidinfo, is meant to explicitly prepare for the type of future Hugh envisions. Everything in FluidDB can be added to -- tagged -- by anyone or any application.)

    Read-only containers of content are an inherently limiting form of media, whether physical or digital. APIs that provide controlled access to information are similarly limited. They prevent the accumulation of unanticipated or personalized contextual information.

    From one perspective, arguing that this kind of convergence is inevitable may seem like a radical oversimplification or wishful thinking, but from another it seems deadly simple and obvious. In plainest terms, I believe the future of publishing is a writeable one. One in which we step beyond the default of read-only publishing via traditional containers and APIs, to something that's both natural and empowering: a world in which data itself becomes social, and in which we can personalize arbitrarily. In other words, a world in which we always have write permission.


    October 07 2010

    The black market for data

    General economic theory suggests that supply will meet demand when it is sufficient and demand will consume what is supplied when appropriate value exists. Black markets emerge when there is a disturbance in the force between supply and demand.

    Black markets summon thoughts of illicit goods: stolen stereo equipment, weapons, drugs, etc. Rarely do we think about digitized data in a black market. But in today's world of open social media APIs, there's a rift between what publishers consider open versus what data consumers are demanding. Most APIs are wrapped in terms of use that users sign off on ("I Agree") before using the API, and those agreements define how data can be used. Often, those agreements are violated.

    The handy gray area

    I can see Coke's logo whenever I like, but I can't use it however I want. Similarly, companies own user-generated content (UGC) -- or at least the means to access it, and/or the conditions under which it can ultimately be displayed. Users, developers and others can see all that information, but it's not free for the taking. Herein lies the conflict and the impetus for black markets dealing in data.

    When your business needs a certain amount or type of data that it can't legitimately have -- as dictated by the terms of service (TOS) of the given set of APIs you need it from -- a gray area comes in handy. You might not participate in a TOS violation yourself, but you could pay someone else to do it and hand over the results. Some forms of page scraping, multi-account/access-key (token) use, API access aggregating (IP farms), and re-syndication are violations of a service's TOS. Ask your data providers how they get their data. You may be surprised at what you hear.

    Black market data consumption generally occurs behind closed doors, so the economic impact is hard to define and understand. In addition, most data publishers/services simply don't understand the commercial data market -- or its opportunity size -- and therefore don't bother with enforcement of their TOS.

    But perspectives change when end-user privacy concerns come to the fore. Publishers engage with full force, taking technical and legal action to shut off various data sources. A recent example: Facebook realized their robots.txt map was promoting page scraping, so they changed the file accordingly and sent cease and desist letters to the offending parties. This interrupted the black market supply of large sets of Facebook data, and various data providers weren't able to deliver their product to their customers. It had the ripple effect of a large-scale drug bust.

    Defined value will challenge the black market

    Scenarios such as these suggest a wild-west attitude pervades the business of data. Publishers feel the need to make data available, and clearly folks want to consume it, yet the use cases around how and when said data wants to be used are still highly variable. Data API publishers and consumers are still trying to understand the value of their data in a highly tumultuous industry, which makes everything feel like a one-off at the moment.

    Within the next year I predict more clarity in the API terms that major data sources convey, and a more common understanding of the "rules of the road." I also believe the value of a service's underlying data will emerge. We'll know the value of a Tweet or a given Facebook status message. Of course, no two messages are alike. A spam message is worthless, whereas a message from a celebrity endorsing a product is highly valuable. Once mechanisms exist for pricing these actions, a formal and granular marketplace can emerge to meet demand, and black markets can dissipate.

    Keep data available

    Despite black markets and TOS violations, it's important for publishers to continue to make their data widely available. Publishers get the public benefit of being labeled as open, as opposed to proprietary. They also effectively outsource many of the hard technical challenges and business models to developers who want to build products based on their data.

    Consumers (i.e., companies and developers) that consume a publisher's data APIs benefit by being able to leverage someone else having solved the hard problem of amassing an interesting, and often large, user base. Consumers are able to repackage and shift the underlying data in a manner that leads to financial gain (and/or application cool-factor fame). Generally the publishers only get a subset of use cases right, and plenty of value is left on the table. Data API consumers exist to identify and take advantage of that value.

    End-users are unfortunately stuck in the middle. We benefit from cool applications being built on the raw data we add to a publisher's product/platform, but we run the risk of being burned by the "I Agree" checkbox. The publisher is in the game to make money, and the content (data) we provided them has inherent behavioral value. The result is experimentation with how that data can be used in a relatively open marketplace. Let's hope for the best!


    September 07 2010

    The state of mapping APIs

    Guest blogger Adam DuVander is the author of "Map Scripting 101," an example-driven guide to building interactive maps on multiple platforms. He also serves as executive editor of ProgrammableWeb.

    Maps took over the web in mid-2005, shortly after the first Where 2.0 conference. They quickly moved from fancy feature to necessary element of any site that contained even a trace of geographic content. Today we're amidst another location and mapping revolution, with mobile making its impact on the web. And with it, we're seeing even more geo services provided by both the old guard and innovative new mapping platforms.

    Though the map itself continues to be important, other geographic data is having a larger impact. Providers are making this data, such as driving directions and business listings, available in increasingly open ways.

    The old guard

    A screenshot from Marcelo Montagna's "Custom Tile Layers with Opacity" Google Maps demo.
    Google had the first mapping API and continues to keep its lead by adding useful new features. The company's Maps V3 was originally optimized for mobile, but in May Google made it the go-to platform for the web as well. With this move, Google showed that the mobile web is at least as important as the web we access from our homes and offices.

    Another sign of mobile's influence on mapping appears in how Google is making some of its newest services available. Geocoding, driving directions and business listings are no longer confined to access via JavaScript. Instead, Google has made these all available via web services, giving developers the freedom to use the results in multiple ways, such as in a native smart phone application.
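
    To show what "available via web services" means in practice, here is a rough sketch of a geocoding call built and parsed outside the browser. The endpoint and JSON field names are my understanding of Google's Geocoding web service and should be checked against the current documentation. The API key and the sample response are placeholders, and no request is actually sent.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint for Google's Geocoding web service; verify against current docs.
GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address: str, api_key: str) -> str:
    """Build the request URL; any HTTP client on any platform could fetch it."""
    return GEOCODE_ENDPOINT + "?" + urlencode({"address": address, "key": api_key})


def first_location(response_text: str):
    """Return (lat, lng) of the first result, or None if the lookup failed."""
    payload = json.loads(response_text)
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return (location["lat"], location["lng"])


# Hand-written placeholder response, trimmed to the fields used above.
SAMPLE_RESPONSE = (
    '{"status": "OK", "results": '
    '[{"geometry": {"location": {"lat": 38.4, "lng": -122.8}}}]}'
)

url = build_geocode_url("Sebastopol, CA", "YOUR_API_KEY")
coords = first_location(SAMPLE_RESPONSE)
```

    Because the result is plain JSON rather than a JavaScript object tied to a map widget, the same coordinates can feed a native phone app, a batch job or a database import.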

    Yahoo has done little to expand its mapping platform in recent years, though it is almost as old as Google's. Due to Yahoo's more lenient terms, its auxiliary geo services, such as geocoding and static maps, get consistent interest from developers. And the company has made improvements, such as its next-generation geocoder, PlaceFinder, which it announced in June.

    Yet, despite Yahoo's tremendous potential, the mapping platform remains untouched. There's hope, given the recent deal with Nokia to provide maps on Yahoo proper. Both Yahoo and Nokia are mum on whether the deal will extend to Yahoo's developer platform, which makes me wonder if Yahoo will leave behind an industry it helped create.

    MapQuest is oft-forgotten by developers, though it has made some of the largest strides with its mapping platform in the last year. A sixth version of its JavaScript API, written from the ground up, recently came out of beta. The new platform takes advantage of the web services that MapQuest has released. It's an attempt to make the thin client code even thinner.

    One of these web services, Directions, made MapQuest a leader among mapping APIs. Launched a year ago, Directions marked the first time routing was available for free without being restricted to JavaScript-only access. Most recently MapQuest made the same service available built on top of OpenStreetMap data.

    The newcomers

    Bing may seem like a strange newcomer, since Microsoft has had a mapping API for some time. Previously called Virtual Earth, it was re-branded in 2009 along with the launch of Microsoft's new search engine. But it's not just surface-level changes. Microsoft has continued to launch new developer services with Bing.

    In addition to the JavaScript SDK, Bing Maps can also be created with Silverlight, which makes for smoother transitions and animation. The Bing Maps site itself runs on Silverlight, and in June Microsoft launched the ability to create map apps, which can run on the main Bing Maps site.

    CloudMade is a company built upon OpenStreetMap, the project creating a wiki-like map that anyone can edit. Using this open data, CloudMade's API gives you access to the OpenStreetMap tiles in a way that is more reliable -- and style-able -- than the project itself.

    CloudMade's Map Style Editor lets you set colors for features, such as roads and parks. Then, make your own style available for embedding using the JavaScript API. In a point-and-click interface, CloudMade supplies much of the same power that super-users have when making map tiles server-side.

    Where have all the hackers gone?

    With so many official mapping APIs available, it's easy to forget that the map mashup culture was founded upon hacking. Paul Rademacher created HousingMaps to show Craigslist rentals and homes for sale on a Google Map before Google had an API. Adrian Holovaty made Chicago Crime to show crime data (which he scraped from the police bureau's website) on a hacked and embedded Google Map.

    Rademacher joined Google in 2005, created the Google Earth plugin and now is part of the team that makes Google Maps. Holovaty's Chicago Crime project became part of EveryBlock, a local news aggregator that was sold to MSNBC last year. Ironically, EveryBlock doesn't use any mapping API, opting instead to use its own minimalist map tiles.

    The mapping hackers of 2010 have also gone server-side, away from the APIs. Using tools like Mapnik, they're styling their own maps, almost always with OpenStreetMap data. Sometimes it's for fun, like Brett Camper's 8-Bit City. And when an earthquake struck Haiti, map hackers responded.

    Mapping the future

    Mapping providers will likely make it easier to create your own customized maps. Already Google Maps V3 has simple styling via CSS-like code. And the process of creating OpenStreetMap tiles is greatly simplified by Tile Drawer.

    It's not just making the map itself that needs simplification, but also storing and accessing the data on top of it. For years developers have had to set up their own databases of locations, which raises the bar for the type of developer who can use maps. Now there are tools like SimpleGeo to make the process easier. However, it would be useful to see these tools baked into the mapping APIs, and we likely will soon.

    Similarly, we need easier ways of expressing data without just adding more markers. Graphic overlays, such as choropleths (regions shaded based on data) and heatmaps, are not accessible to most developers. The processes need to run on a server capable of geo-referencing the graphic it outputs. And services available to do this tend to charge. The open government movement is already tied closely to mapping. Hopefully projects for the greater good will fill in feature gaps where mapping providers don't see business opportunities.
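
    The data side of a choropleth is simpler than the rendering side. Here is a minimal sketch of the classification step only, assuming an equal-interval scheme: each region's value is assigned to one of a few shade classes. The region names, values and colors are made up for illustration, and the harder part described above (drawing geo-referenced shapes on a server) is not shown.

```python
def equal_interval_classes(values, n_classes):
    """Map each region to a class index 0..n_classes-1 using equal-width bins."""
    lo, hi = min(values.values()), max(values.values())
    width = (hi - lo) / n_classes or 1.0  # avoid dividing by zero if all values match
    classes = {}
    for region, value in values.items():
        # The maximum value would land one past the last bin, so clamp it.
        classes[region] = min(int((value - lo) / width), n_classes - 1)
    return classes


# Illustrative light-to-dark palette and made-up data.
SHADES = ["#eff3ff", "#bdd7e7", "#6baed6", "#2171b5"]
rates = {"A": 2.0, "B": 5.0, "C": 8.0, "D": 10.0}

classes = equal_interval_classes(rates, len(SHADES))
colors = {region: SHADES[index] for region, index in classes.items()}
```

    Equal intervals are only one choice. Quantiles or hand-picked breaks can tell a very different story with the same data, which is part of why these overlays are harder to get right than markers.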

    Obviously, mobile will play a huge role in the future of mapping. Already we've seen an impact, yet there are far fewer sites taking advantage of the user's location than could. Expect the next generation of store locators, for example, to be much more exciting. But that's just the beginning.


    August 27 2010

    Four short links: 27 August 2010

    1. Working Audio Data Demos -- the new Firefox has a very sweet audio data API and some nifty demos like delay pedals, a beat detector (YouTube) and a JavaScript text-to-speech generator. (via jamesaduncan on Twitter)
    2. Estimating the Economic Impact of Mass Digitization Projects on Copyright Holders: Evidence from the Google Book Search Litigation -- [T]he revenues and profits of the publishers who believe themselves to be most aggrieved by GBS, as measured by their willingness to file suit against Google for copyright infringement, increased at a faster rate after the project began, as compared to before its commencement. The rate of growth by publishers most affected by GBS is greater than the growth of the overall U.S. economy or of retail sales.
    3. In History-Rich Region, a Very New System Tracks Very Old Things (NY Times) -- Getty built a web database to help Jordan track its antiquities sites (and threats to them) with Google Earth satellite images. (via auchmill on Twitter)
    4. What Women Want and How Not to Give it To Them -- thought-provoking piece about the ways in which corporate diversity efforts fail. Must read.

    August 18 2010

    Four short links: 18 August 2010

    1. BBC Dimensions -- brilliant work, a fun site that lets you overlay familiar places with famous and notable things so you can get a better sense of how large they are. Example: the Colossus of Rhodes straddling O'Reilly HQ, the Library of Alexandria vs the Google campus, and New Orleans Mardi Gras began at the headquarters of Fred Phelps's Westboro Baptist Church. (via this piece about its background)
    2. Podapter -- simple plug that takes mini-USB and goes into an iPod or iPhone. (via Tuesday product awesomeness)
    3. New NexusOne Radio Firmware -- a glimpse of the world that's sprung up sharing the latest goodies between countries, carriers, and developers. For everyone for whose products the street has found a new use, the challenge is to harness this energy, enthusiasm, knowledge, and devotion. In terms of cognitive surplus, this far exceeds the 1 LOLCAT minimum standard unit. (via YuweiWang on Twitter)
    4. Echo Nest Remix API -- access to database of song characteristics and tools to manipulate tunes. See the Technology Review article for examples of what it's capable of. (via aaronsw on Twitter)

    July 27 2010

    Data as a service

    The last two months have seen some important developments in the way data is made available. First, Infochimps created a web API for publishing data. The number of datasets is relatively limited; there are five available now, of which four have to do with Twitter data, and one maps IP addresses to census data (and that one appears not to be available yet). Their site allows you to request (or vote on requests) for new datasets. Pricing is reasonable. You can do significant experimentation, or even run a useful low-volume application, without running up any charges.

    "Data as a service" is not a new term, by any means. There have been any number of data services over the years. But this is something different from the many services that have sold data -- or even the more recent services that have sold data via the Internet. Data as a service is another part of the cloud computing alphabet soup, on par with "infrastructure, software, or platform as a service" (IaaS/SaaS/PaaS). Infochimps makes possible applications where data lives in the cloud. Granted, you're not going to access terabyte datasets over the Internet. But neither do you have to download (or have shipped) a giant dataset for the few Kilo- or Megabytes that interest you. Infochimps is pushing a bit beyond simple data access. Their Twitter APIs aren't raw data, but implement trust metrics, influence metrics, and more. So perhaps it's better to call this "algorithm as a service" (AaaS), not unlike the Prediction API (machine learning using Google's algorithms) that was announced at Google I/O.

    The second new data service that has impressed me is Google's new Public Data Explorer. I assume that everyone reading this article has seen the latest spectacular data visualizations, in the New York Times, Nathan Yau's Flowingdata blog, and elsewhere. Here's one example from GE (created by Ben Fry's Fathom Information Design). Public Data Explorer lets you create your own visualizations, based on Google's data.

    Here's one of their examples (nicer than anything I came up with on the fly). It's an animation of per-capita income in California counties that shows how the individual counties have fared from 1969 to 2007. I've highlighted a few interesting counties -- let's see how they perform:

    Not surprisingly, the difference between the richest and poorest counties has drastically increased. Google provides many datasets, and gives you interesting ways to arrange and animate the data. I've displayed a fairly simple bar graph animation, but you can also do bubbles on a map and several kinds of Cartesian plots. You can slice and dice regions in many different ways, frequently down to the county level. They've got data from the European community, from Australia, the World Bank, and other sources. None of this is exactly new: the data has been around for years. What Google Public Data Explorer does is enable you to explore the data yourself and paste the result into your own sites and blogs.

    Wolfram Alpha's Widgets provides yet another way to interact with data -- potentially the most flexible yet. (You'll have to create an account and sign in.) Widgets are web components for interacting with Wolfram Alpha's data back-end. You can do pure Mathematica queries, but you can also interact with the extensive data Wolfram has been collecting. There's no programming required (unless you want to submit Mathematica queries). There's a web-based widget builder where you start with an Alpha request, like "US Unemployment," parameterize it, specify the layout you want and how you want to embed it (lightbox, popup, and iFrame styles are supported), and finally test it. At the end, you get a link or a clump of JavaScript that you can paste into a website or a blog, and you can include the widget in a public gallery. You can also post directly to Facebook, Twitter, and most other social sites.

    Here's an example: a simple widget to compare two stocks. You can select your own stocks or just use my defaults (Apple and Google):

    It took me about five minutes to whip up this widget, starting with the simple Alpha query "APPL GOOG." But there are many ways to look up stock prices and histories. What about something more esoteric? Alpha knows an incredible amount. The other day my wife and I couldn't remember what a half-diminished seventh chord was. Alpha knows, and can show you a piano keyboard, guitar fingerings, and even play the chord. Here's the result:

    You'll confuse it if you try really odd chords; don't try fancy jazz ninths and thirteenths, and remember to specify "triad" if you want your basic three-note chord.
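
    If you'd rather not guess what Alpha wants, the chord itself is easy to compute. A half-diminished seventh stacks a minor third, a diminished fifth and a minor seventh on the root, which is 3, 6 and 10 semitones. The sketch below is my own illustration, not anything Alpha exposes, and it spells every note with sharps, so enharmonic names (B-flat versus A-sharp) will not always match a score.

```python
# Twelve pitch classes, spelled with sharps only (a simplification).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitone offsets from the root: root, minor third, diminished fifth, minor seventh.
HALF_DIMINISHED_SEVENTH = (0, 3, 6, 10)


def chord_notes(root, intervals):
    """Return the note names of a chord built on root from semitone offsets."""
    start = NOTE_NAMES.index(root)
    return [NOTE_NAMES[(start + step) % 12] for step in intervals]


b_half_diminished = chord_notes("B", HALF_DIMINISHED_SEVENTH)
```

    Built on B, the chord comes out as B, D, F and A, all white keys on the piano.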

    Alpha's weak point is that you frequently end up playing "guess what Alpha wants." I suppose that's what you trade off for flexibility; but it was surprisingly difficult to build an interest calculator. Building the widget was simple enough, but coming up with the initial query was difficult. Alpha would either assume I was paying off a loan, or doing a present value calculation, or something else, until I juggled the terms into the right order, which happened to be "10% interest $100 initial value 7 years." Not illogical, but neither were any of the other attempts.
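
    For comparison, here is the calculation I was after, written out directly. This is a small sketch that assumes interest compounds once a year, which is only one of several readings of "10% interest" and may not be the one Alpha settled on. The simple-interest version is included to show how far the interpretations diverge.

```python
def compound_future_value(principal, annual_rate, years):
    """Future value with interest compounded once per year (an assumption)."""
    return principal * (1 + annual_rate) ** years


def simple_future_value(principal, annual_rate, years):
    """Future value with simple (non-compounding) interest."""
    return principal * (1 + annual_rate * years)


# The query from the text: $100 initial value, 10% interest, 7 years.
compounded = compound_future_value(100.0, 0.10, 7)
simple = simple_future_value(100.0, 0.10, 7)
```

    Compounded annually, $100 grows to roughly $194.87 after seven years. With simple interest it reaches only $170, so the ambiguity Alpha struggles with is a real one.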

    That's a minor problem, though. Widgets makes it fun to explore data and computation, and makes it trivial to share the results. With "data as a service" APIs like Infochimps, and embeddable data components like Google Public Data Explorer and Wolfram Alpha Widgets, we're seeing the democratization of data and data visualization: new ways to access data, new ways to play with data, and new ways to communicate the results to others.


    May 28 2010

    Four short links: 28 May 2010

    1. The Intuition Behind the Fisher-Yates Shuffle -- this is a simple algorithm to randomize a list of things, but most people are initially puzzled that it is more efficient than a naive shuffling algorithm. This is a nice explanation of the logic behind it.
    2. Wikipedia and Inherent Open Source Bias -- a specific case of what I think of as the Firefly Principle: what happens on the Internet isn't representative of real life.
    3. Malaysian Public Sector Open Source Program -- the Malaysian government is a heavy and successful user of open source.
    4. Guardian's Platform Now Open for Business (GigaOm) -- elegant summary breakdown of services from the Guardian: metadata for free, content if you pay, custom APIs and applications if you pay more. I'm interested to see how well this works, given that the newspaper business is struggling to find a business model that values content.
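
    For reference, the Fisher-Yates shuffle from the first link fits in a few lines. This is a minimal sketch of the standard algorithm, not code from the linked article: walk the list from the end, and swap each position with a randomly chosen position at or before it. It makes a single pass over the list.

```python
import random


def fisher_yates_shuffle(items, rng=random):
    """Return a shuffled copy of items using the Fisher-Yates algorithm."""
    result = list(items)
    # Work backwards; everything after index i is already in its final place.
    for i in range(len(result) - 1, 0, -1):
        j = rng.randint(0, i)  # inclusive on both ends, so j may equal i
        result[i], result[j] = result[j], result[i]
    return result


deck = list(range(10))
shuffled = fisher_yates_shuffle(deck, random.Random(42))  # seeded for repeatability
```

    Allowing j to equal i matters: an element must be able to stay where it is, otherwise some orderings can never come up.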

    May 25 2010

    Four short links: 25 May 2010

    1. Lending Merry-Go-Round -- these guys have been Australia's sharpest satire for years, filling the role of the Daily Show. Here they ask some strong questions about the state of Europe's economies ... (via jdub on Twitter)
    2. What's Powering the Guardian's Content API -- Scala and Solr/Lucene on EC2 is the short answer. The long answer reveals the details of their setup, including some of their indexing tricks that mean Solr can index all their content in just an hour. (via Simon Willison)
    3. What I Learned About Engineering from the Panama Canal (Pete Warden) -- I consider myself a cheerful pessimist. I've been through enough that I know how steep the odds of success are, but I've made a choice that even a hopeless fight in a good cause is worthwhile. What a lovely attitude!
    4. Mapping the Evolution of Scientific Fields (PLoSone) -- clever use of data. We build an idea network consisting of American Physical Society Physics and Astronomy Classification Scheme (PACS) numbers as nodes representing scientific concepts. Two PACS numbers are linked if there exist publications that reference them simultaneously. We locate scientific fields using a community finding algorithm, and describe the time evolution of these fields over the course of 1985-2006. The communities we identify map to known scientific fields, and their age depends on their size and activity. We expect our approach to quantifying the evolution of ideas to be relevant for making predictions about the future of science and thus help to guide its development.
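
    The network construction described in the fourth link can be sketched in a few lines: classification codes are nodes, and two codes are linked whenever a publication carries both. The code below is my own illustration with made-up placeholder codes, not the paper's implementation, and it stops before the community-finding step.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_edges(publications):
    """Count, for each pair of codes, how many publications reference both."""
    edges = Counter()
    for codes in publications:
        # Deduplicate and sort so each pair has one canonical (a, b) ordering.
        for a, b in combinations(sorted(set(codes)), 2):
            edges[(a, b)] += 1
    return edges


# Placeholder codes standing in for PACS numbers; each list is one publication.
papers = [["A", "B", "C"], ["A", "B"], ["C", "D"]]
edges = build_cooccurrence_edges(papers)
```

    The edge weights (here A and B appear together twice) are what a community-finding algorithm would then use to group codes into fields.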

    May 24 2010

    Four short links: 24 May 2010

    1. Google Documents API -- permissions, revisions, search, export, upload, and file. Somehow I had missed that this existed.
    2. Profile of Wikileaks Founder Julian Assange (Sydney Morning Herald) -- he draws no salary, is constantly on the move, lived for a while in a compound in Nairobi with other NGOs, and cowrote the rubberhose filesystem which offers deniable encryption.
    3. OpenPCR -- producing an open design for a PCR machine. PCR is how you take a single piece of DNA and make lots of copies of it. It's the first step in a lot of interesting bits of molecular biology. They're using Ponoko to print the cases. (via davetenhave on Twitter)
    4. Metric Mania (NY Times) -- The problem isn’t with statistical tests themselves but with what we do before and after we run them. First, we count if we can, but counting depends a great deal on previous assumptions about categorization. Consider, for example, the number of homeless people in Philadelphia, or the number of battered women in Atlanta, or the number of suicides in Denver. Is someone homeless if he’s unemployed and living with his brother’s family temporarily? Do we require that a woman self-identify as battered to count her as such? If a person starts drinking day in and day out after a cancer diagnosis and dies from acute cirrhosis, did he kill himself? The answers to such questions significantly affect the count. We can never be reminded enough that the context for data must be made as open as the data. To do otherwise is to play Russian Roulette with the truth.

    May 05 2010

    Four short links: 5 May 2010

    1. Sketch for Processing -- an IDE for Processing based on Mozilla's Bespin.
    2. British Election Results to be Broadcast on Big Ben -- the monument is the message. Lovely integration of real-time data and architecture, an early step for urban infrastructure as display.
    3. API -- an alpha API for face recognition.
    4. Average Number of Books/Kindle -- short spreadsheet figuring it out from cited numbers. (Spoiler: the answer is 27)

    May 03 2010

    Four short links: 3 May 2010

    1. Science Hack Day -- Saturday, June 19th and Sunday, June 20th, 2010, in the Guardian offices in London. A meeting place for the designer/coder class and scientists, with datasets as the common language. (via timoreilly on Twitter)
    2. Facebook's Evil Interface (EFF) -- Facebook's new M.O. is to say "to better help you, we took away your privacy. If you are stupid and wish to attempt to retain your privacy, don't not avoid to fail to click here. Now click here. Now click here ... ha, moved it! Moved it again! Gotcha!". Attempting to use Facebook to talk to friends without having your friendships and interests pimped to the data mining Johns is as hard as canceling an AOL subscription.
    3. Make Your Own 3G Router -- an easter-egg inside the new Chumby model (which O'Reilly AlphaTech Ventures invested in).
    4. Australian Government's Response to the Web 2.0 Taskforce -- it's all positive: all but one recommendation accepted. Another very positive step from the Aussies.
