January 25 2013

Four short links: 25 January 2013

  1. How to Write a Good Bio (Scott Berkun) — something we all have to do, and rarely do well the first time. Excellent advice.
  2. Scumbag Steve’s Advice for Annoying Facebook Girl -- Some people can’t distinguish the internet from real life. There are people who refuse to believe my name isn’t Steve and that I am not really the scumbag (well not all the time, that is). Just remember who you are. And that you know you’re a decent kid. Blake (the guy whose image was adopted as “Scumbag Steve” by meme-makers) was 21 when he wrote that, and it remains the best advice for anyone dealing with sudden visibility in the public eye.
  3. The Battle for Obama’s Tech (The Verge) — same old story: the software that got Obama elected won’t be released. Instead it’ll atrophy and have to be rewritten in four years’ time. How do I know this? The morons at the Democratic Party did it with Kerry’s run and again for Obama’s first campaign. It’s a choice the OFA developers warn could not only squander the digital advantage the Democrats now hold, but also severely impact their ability to recruit top tech talent in the future.
  4. Precog Software (Wired) — researchers assembled a dataset of more than 60,000 crimes, including homicides, then wrote an algorithm to find the people behind the crimes who were more likely to commit murder when paroled or put on probation. Berk claims the software could identify eight future murderers out of 100. The software parses about two dozen variables, including criminal record and geographic location. The type of crime and the age at which it was committed, however, turned out to be two of the most predictive variables. [...] The software aims to replace the judgments parole officers already make based on a parolee’s criminal record and is currently being used in Baltimore and Philadelphia. I look forward to the study comparing human judgement from parole officers against algorithmic judgement.

January 06 2013

Mark Twain on influence

In 1905 Mark Twain wrestled with the sort of request that many readers here have undoubtedly encountered: a new writer with the most tenuous of connections (her uncle was briefly a neighbor in a Nevada mining town) asks Twain to use his influence to get her manuscript published.

It never hurts to carry an introduction from a well-regarded intermediary, as long as your introducer can actually speak to the quality of your work. I think of Twain’s anguished reply every time I’m asked to recommend someone or something I don’t know — or am tempted to ask the same favor of someone else.

Twain’s message is ultimately optimistic: don’t simply try to accumulate influence. Instead, come up with a good idea and sell it on its merits. The world will listen.

The full text of Twain’s essay is below, via Project Gutenberg.

A HELPLESS SITUATION

Once or twice a year I get a letter of a certain pattern, a pattern that never materially changes, in form and substance, yet I cannot get used to that letter—it always astonishes me. It affects me as the locomotive always affects me: I say to myself, “I have seen you a thousand times, you always look the same way, yet you are always a wonder, and you are always impossible; to contrive you is clearly beyond human genius—you can’t exist, you don’t exist, yet here you are!”

I have a letter of that kind by me, a very old one. I yearn to print it, and where is the harm? The writer of it is dead years ago, no doubt, and if I conceal her name and address—her this-world address—I am sure her shade will not mind. And with it I wish to print the answer which I wrote at the time but probably did not send. If it went—which is not likely—it went in the form of a copy, for I find the original still here, pigeonholed with the said letter. To that kind of letters we all write answers which we do not send, fearing to hurt where we have no desire to hurt; I have done it many a time, and this is doubtless a case of the sort.

THE LETTER

X———, California, JUNE 3, 1879.

Mr. S. L. Clemens, Hartford, Conn.:

Dear Sir,—You will doubtless be surprised to know who has presumed to write and ask a favor of you. Let your memory go back to your days in the Humboldt mines—’62-’63. You will remember, you and Clagett and Oliver and the old blacksmith Tillou lived in a lean-to which was half-way up the gulch, and there were six log cabins in the camp—strung pretty well separated up the gulch from its mouth at the desert to where the last claim was, at the divide. The lean-to you lived in was the one with a canvas roof that the cow fell down through one night, as told about by you in Roughing It—my uncle Simmons remembers it very well. He lived in the principal cabin, half-way up the divide, along with Dixon and Parker and Smith. It had two rooms, one for kitchen and the other for bunks, and was the only one that had. You and your party were there on the great night, the time they had dried-apple-pie, Uncle Simmons often speaks of it. It seems curious that dried-apple-pie should have seemed such a great thing, but it was, and it shows how far Humboldt was out of the world and difficult to get to, and how slim the regular bill of fare was. Sixteen years ago—it is a long time. I was a little girl then, only fourteen. I never saw you, I lived in Washoe. But Uncle Simmons ran across you every now and then, all during those weeks that you and party were there working your claim which was like the rest. The camp played out long and long ago, there wasn’t silver enough in it to make a button. You never saw my husband, but he was there after you left, and lived in that very lean-to, a bachelor then but married to me now. He often wishes there had been a photographer there in those days, he would have taken the lean-to. He got hurt in the old Hal Clayton claim that was abandoned like the others, putting in a blast and not climbing out quick enough, though he scrambled the best he could. It landed him clear down on the trail and hit a Piute. For weeks they thought he would not get over it but he did, and is all right, now. Has been ever since. This is a long introduction but it is the only way I can make myself known. The favor I ask I feel assured your generous heart will grant: Give me some advice about a book I have written. I do not claim anything for it only it is mostly true and as interesting as most of the books of the times. I am unknown in the literary world and you know what that means unless one has some one of influence (like yourself) to help you by speaking a good word for you. I would like to place the book on royalty basis plan with any one you would suggest.

This is a secret from my husband and family. I intend it as a surprise in case I get it published.

Feeling you will take an interest in this and if possible write me a letter to some publisher, or, better still, if you could see them for me and then let me hear.

I appeal to you to grant me this favor. With deepest gratitude I thank you for your attention.

One knows, without inquiring, that the twin of that embarrassing letter is forever and ever flying in this and that and the other direction across the continent in the mails, daily, nightly, hourly, unceasingly, unrestingly. It goes to every well-known merchant, and railway official, and manufacturer, and capitalist, and Mayor, and Congressman, and Governor, and editor, and publisher, and author, and broker, and banker—in a word, to every person who is supposed to have “influence.” It always follows the one pattern: “You do not know me, but you once knew a relative of mine,” etc., etc. We should all like to help the applicants, we should all be glad to do it, we should all like to return the sort of answer that is desired, but—Well, there is not a thing we can do that would be a help, for not in any instance does that letter ever come from anyone who can be helped. The struggler whom you could help does his own helping; it would not occur to him to apply to you, stranger. He has talent and knows it, and he goes into his fight eagerly and with energy and determination—all alone, preferring to be alone. That pathetic letter which comes to you from the incapable, the unhelpable—how do you who are familiar with it answer it? What do you find to say? You do not want to inflict a wound; you hunt ways to avoid that. What do you find? How do you get out of your hard place with a contented conscience? Do you try to explain? The old reply of mine to such a letter shows that I tried that once. Was I satisfied with the result? Possibly; and possibly not; probably not; almost certainly not. I have long ago forgotten all about it. But, anyway, I append my effort:

THE REPLY

I know Mr. H., and I will go to him, dear madam, if upon reflection you find you still desire it. There will be a conversation. I know the form it will take. It will be like this:

MR. H. How do her books strike you?

MR. CLEMENS. I am not acquainted with them.

H. Who has been her publisher?

C. I don’t know.

H. She has one, I suppose?

C. I—I think not.

H. Ah. You think this is her first book?

C. Yes—I suppose so. I think so.

H. What is it about? What is the character of it?

C. I believe I do not know.

H. Have you seen it?

C. Well—no, I haven’t.

H. Ah-h. How long have you known her?

C. I don’t know her.

H. Don’t know her?

C. No.

H. Ah-h. How did you come to be interested in her book, then?

C. Well, she—she wrote and asked me to find a publisher for her, and mentioned you.

H. Why should she apply to you instead of me?

C. She wished me to use my influence.

H. Dear me, what has influence to do with such a matter?

C. Well, I think she thought you would be more likely to examine her book if you were influenced.

H. Why, what we are here for is to examine books—anybody’s book that comes along. It’s our business. Why should we turn away a book unexamined because it’s a stranger’s? It would be foolish. No publisher does it. On what ground did she request your influence, since you do not know her? She must have thought you knew her literature and could speak for it. Is that it?

C. No; she knew I didn’t.

H. Well, what then? She had a reason of some sort for believing you competent to recommend her literature, and also under obligations to do it?

C. Yes, I—I knew her uncle.

H. Knew her uncle?

C. Yes.

H. Upon my word! So, you knew her uncle; her uncle knows her literature; he endorses it to you; the chain is complete, nothing further needed; you are satisfied, and therefore—

C. No, that isn’t all, there are other ties. I know the cabin her uncle lived in, in the mines; I knew his partners, too; also I came near knowing her husband before she married him, and I did know the abandoned shaft where a premature blast went off and he went flying through the air and clear down to the trail and hit an Indian in the back with almost fatal consequences.

H. To him, or to the Indian?

C. She didn’t say which it was.

H. (With a sigh). It certainly beats the band! You don’t know her, you don’t know her literature, you don’t know who got hurt when the blast went off, you don’t know a single thing for us to build an estimate of her book upon, so far as I—

C. I knew her uncle. You are forgetting her uncle.

H. Oh, what use is he? Did you know him long? How long was it?

C. Well, I don’t know that I really knew him, but I must have met him, anyway. I think it was that way; you can’t tell about these things, you know, except when they are recent.

H. Recent? When was all this?

C. Sixteen years ago.

H. What a basis to judge a book upon! At first you said you knew him, and now you don’t know whether you did or not.

C. Oh yes, I know him; anyway, I think I thought I did; I’m perfectly certain of it.

H. What makes you think you thought you knew him?

C. Why, she says I did, herself.

H. She says so!

C. Yes, she does, and I did know him, too, though I don’t remember it now.

H. Come—how can you know it when you don’t remember it.

C. I don’t know. That is, I don’t know the process, but I do know lots of things that I don’t remember, and remember lots of things that I don’t know. It’s so with every educated person.

H. (After a pause). Is your time valuable?

C. No—well, not very.

H. Mine is.

So I came away then, because he was looking tired. Overwork, I reckon; I never do that; I have seen the evil effects of it. My mother was always afraid I would overwork myself, but I never did.

Dear madam, you see how it would happen if I went there. He would ask me those questions, and I would try to answer them to suit him, and he would hunt me here and there and yonder and get me embarrassed more and more all the time, and at last he would look tired on account of overwork, and there it would end and nothing done. I wish I could be useful to you, but, you see, they do not care for uncles or any of those things; it doesn’t move them, it doesn’t have the least effect, they don’t care for anything but the literature itself, and they as good as despise influence. But they do care for books, and are eager to get them and examine them, no matter whence they come, nor from whose pen. If you will send yours to a publisher—any publisher—he will certainly examine it, I can assure you of that.

December 31 2012

Saving publishing, one tweet at a time

Traffic comes to online publishers in two ways: search and social. Because of this, writing for the tweet is a new discipline every writer and editor must learn. You’re not ready to publish until you find the well-crafted headline that fits in 100 characters or so, and pick an image that looks great shared at thumbnail size on Facebook and LinkedIn.

But what of us, the intelligent reader? Nobody wants to look like a retweet bot for publishers. The retweet allows us no space to say why we ourselves liked an article.

Those of us with time to dedicate are familiar with crafting our own awkward commentaries: “gr8 insight in2 state of mob,” “saw ths tlk last Feb,” “govt fell off fiscal clf”. Most of the time it’s easier just to bookmark, or hit “read later,” and not put in the effort to share.

Rescue is at hand. The writer and programmer Paul Ford has created a bookmarklet, entitled Save Publishing. On activating the bookmarklet while viewing an article you wish to share, it highlights and makes clickable all the tweetable phrases from the page.

Presto! A quick way to share what you like from a piece without having to think too hard: as a bonus, it makes you look intelligent and as if you read the entire article.
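
For the curious, here is a minimal Python sketch of the general idea: split an article into sentences and keep the ones short enough to quote in a tweet. This is not Ford's code, and the bookmarklet may well work differently. The function name `tweetable_phrases` and the `max_len` and `min_len` cutoffs are assumptions made up for illustration.

```python
import re


def tweetable_phrases(text, max_len=100, min_len=20):
    """Return the sentences in `text` that are short enough to tweet.

    max_len and min_len are illustrative assumptions, not values taken
    from the Save Publishing bookmarklet.
    """
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # Keep sentences that are neither trivially short nor too long to quote.
    return [s for s in sentences if min_len <= len(s) <= max_len]
```

A real tool would need a smarter sentence splitter (abbreviations and quotations trip up this one), but the filter step would look much the same.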

“Save Publishing” highlights the tweetable phrases in an article

Why will this simple bookmarklet really save publishing? Not singlehandedly, for sure, but anything that helps readers express what they like and share with each other is a boon to publishers and readers alike. Think of Save Publishing as Kindle’s highlight feature, writ large for the web.

Ford writes that Save Publishing started as a joke, and “now it’s serious and I use it all day.” I’ve certainly enjoyed it, enough to contribute a little code to the project myself. Best of all, it’s saving me from writing tweets for my own pieces!

December 26 2011

Four short links: 26 December 2011

  1. Pattern -- a BSD-licensed bundle of Python tools for data retrieval, text analysis, and data visualization. If you were going to get started with accessible data (Twitter, Google), the fundamentals of analysis (entity extraction, clustering), and some basic visualizations of graph relationships, you could do a lot worse than to start here.
  2. Factorie (Google Code) -- Apache-licensed Scala library for a probabilistic modeling technique successfully applied to [...] named entity recognition, entity resolution, relation extraction, parsing, schema matching, ontology alignment, latent-variable generative models, including latent Dirichlet allocation. The state-of-the-art big data analysis tools are increasingly open source, presumably because the value lies in their application not in their existence. This is good news for everyone with a new application.
  3. Playtomic -- analytics as a service for gaming companies to learn what players actually do in their games. There aren't many fields untouched by analytics.
  4. Write or Die -- iPad app for writers where, if you don't keep writing, it begins to delete what you wrote earlier. Good for production to deadlines; reflective editing and deep thought not included.

November 10 2011

Four short links: 10 November 2011

  1. Steve Case and His Companies (The Atlantic) -- Maybe you see three random ideas. Case and his team saw three bets that paid off thanks to a new Web economy that promotes power in numbers and access over ownership. "Access over ownership" is a phrase that resonated. (via Walt Mossberg)
  2. Back to the Future -- teaching kids to program by giving them microcomputers from the 80s. I sat my kids down with a C64 emulator and an Usborne book to work through some BASIC examples. It's not a panacea, but it solves a lot of bootstrapping problems with teaching kids to program.
  3. Replaying Writing an Essay -- Paul Graham wrote an essay using one of his funded startups, Stypi, and then had them hack it so you could replay the development with the feature that everything that was later deleted is highlighted yellow as it's written. The result is fascinating to watch. I would like my text editor to show me what I need to delete ;)
  4. Jawbone Live Up -- wristband that syncs with iPhone. Interesting wearable product, tied into ability to gather data on ourselves. The product looks physically nice, but the quantified self user experience needs the same experience and smoothness. Intrusive ("and now I'm quantifying myself!") limits the audience to nerds or the VERY motivated.

November 03 2011

How I automated my writing career

In 2001, I got an itch to write a book. Like many people, I naïvely thought, "I have a book or two in me," as if writing a book is as easy as putting pen to paper. It turns out to be very time consuming, and that's after you've spent countless hours learning and researching and organizing your topic of choice. But I marched on and wrote or co-wrote 10 books in a five-year period. I'm a glutton for punishment.

My day job during that time was programming. I've been programming for 16 years. My whole career I've focused on automating the un-automatable — essentially making computers do things people never thought they could do. By the time I started on my 10th book, I got another kind of itch — I wanted to automate my writing career. I was getting bored with the tedium of writing books, and the money wasn't that good.

But that's absurd, right? How can a computer possibly write something coherent and informative, much less entertaining? The "how can a computer possibly do X?" questions are the ones I've spent my career trying to answer. So, I set out on a quest to create software that could write. It took more effort than writing 10 books put together, but after building a team of 12 people, we were able to use our software to generate more than 100,000 sports-related stories in a nine-month period.

Before I get into specifics with what our software produces, I think it's worth highlighting some of the attributes that make software a great candidate to be a writer:

  • Software doesn't get writer's block, and it can work around the clock.
  • Software can't unionize or file class-action lawsuits because we don't pay enough (like many of the content farms have had to deal with).
  • Software doesn't get bored and start wondering how to automate itself.
  • Software can be reprogrammed, refactored and improved — continuously.
  • Software can benefit from the input of multiple people. This is unlike traditional writing, which tends to be a solitary event (+1 if you count the editor).
  • Perhaps most importantly, software can access and analyze significantly more data than what a single person (or even a group of people) can do on their own.

Software isn't a panacea, though. Not all content can be easily automated (yet). The type of content my company, Automated Insights, has automated is quantitatively oriented. That's the trick. We've automated content by applying meaning to numbers, to data. Sports was the first category we tackled. Sports by their nature are very data heavy. By our internal estimates, 70% of all sports-related articles are analyzing numbers in one form or another.

Our technology combines a large database of structured data, a real-time feed of stats, a large database of phrases, and algorithms to tie it all together to produce articles from two to eight paragraphs in length. The algorithms look for interesting patterns in the data to determine what to write about.
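
To make the idea concrete, here is a toy Python sketch of the data-plus-phrases approach: a rule looks for a pattern in the numbers, and the pattern selects a phrase. It is an illustration only, not our production system. The `PHRASES` table, the `pick_angle` thresholds, and the `recap` function are all hypothetical, and the scores used with it are made up.

```python
# Hypothetical phrase database: one sentence template per story angle.
PHRASES = {
    "blowout": "{winner} rolled past {loser} {w_pts}-{l_pts}.",
    "close": "{winner} edged {loser} {w_pts}-{l_pts}.",
    "default": "{winner} beat {loser} {w_pts}-{l_pts}.",
}


def pick_angle(margin):
    """Choose a story angle from the margin of victory.

    The thresholds are arbitrary assumptions chosen for illustration.
    """
    if margin >= 20:
        return "blowout"
    if margin <= 3:
        return "close"
    return "default"


def recap(game):
    """Write a one-sentence recap from a structured game record."""
    home, away = game["home"], game["away"]
    if home["pts"] == away["pts"]:
        raise ValueError("this sketch assumes the game has a winner")
    # Order the two teams so the winner comes first.
    winner, loser = (home, away) if home["pts"] > away["pts"] else (away, home)
    template = PHRASES[pick_angle(winner["pts"] - loser["pts"])]
    return template.format(
        winner=winner["name"],
        loser=loser["name"],
        w_pts=winner["pts"],
        l_pts=loser["pts"],
    )
```

Given a made-up game in which "Team A" scores 90 at home against 68 for "Team B", `recap` picks the blowout template. A real system layers many more patterns and phrases on top of this, but the shape is the same: data in, angle chosen, words out.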

In November of 2010, we launched the StatSheet Network, a collection of 345 websites (one for every Division-I NCAA Basketball team) that were fully automated. Check out my favorite team: UNC Tar Heels.

Software mines data to construct short game recaps.

We included the typical kind of stats you'd expect on a basketball site, but also embedded visualizations and our fully automated articles. We automated 14 different types of stories, everything from game recaps and previews to players of the week and historical retrospectives. Recently, we launched similar sites for every MLB team (check out the Detroit Tigers site), and soon we are launching sites for every NFL and NCAA Football team.

Sports is only one of many different categories we are working on. We've also done work in finance, real estate and a few other data-intensive industries. However, don't limit your thinking on what's possible. We get a steady stream of requests from non-obvious industries, such as pharmaceutical clinical trials and even domain name registrars. Any area with large datasets where people are trying to derive meaning from the data is a potential candidate for our technology.

Automation plus human, not automation versus human

Creating software that can write long-form narratives is very difficult, full of all sorts of interesting artificial intelligence, machine learning and natural language problems. But with the right mix of talent (and funding), we've been able to do it. It really does take a keen understanding of how software and the written word can work together.

I often hear it suggested that software-generated prose must be very bland and stilted. That's only the case if the folks behind the software write bland and stilted prose. Software can be just as opinionated as any writer.

A common, and funny, question I get from journalists is: "when will you automate me out of a job?" I find the question humorous because built into the question is the assumption that if our software can write the perfect story on a particular topic, then no one else should attempt to write about it. That's just not going to happen. What's happening instead is that media companies are using our software to help scale their businesses. Initially, that takes the form of generating stories on topics a media outlet didn't have the resources to cover. In other cases, it means putting our stories through an editorial process that customizes the content to the specific needs of the publisher. You still need humans for that. There will be less of a need for folks to spend their time writing purely quantitative pieces, but that should be liberating. Now, they can focus on more qualitative, value-added commentary that humans are inherently good at. Quantitative stories can — and probably should — be mostly automated because computers are better at that.

Software will make hyperlocal content possible and even profitable. Many companies have tried to solve the "hyperlocal problem" with minimal success. It's just too hard to scale content creation out to every town in the U.S. (or the world, for that matter). For certain categories (e.g. high school sports), software-generated content makes perfect sense. You'll see automated content play a big role here in the coming years.

Software-generated books?

Because I've been so focused on running Automated Insights, I haven't had time to write any new books recently. I suggested to a colleague that we should turn our software loose and have it write my next book. He looked at me and asked, "How can it possibly do that?" That's what I like to hear.

But is a software-generated book even feasible? Our software can create eight paragraphs now, but is it possible to create eight chapters' worth of content? The answer is "yes," but not quite the same kind of technical books I used to write, at least right now. It would be easy for us to extend our technology to write even longer pieces. That's not the issue. Our software is good at quantitative analysis using structured data.

The kind of books I used to write were not based on data and were qualitative in nature. I pulled from my experience and did supplemental research, made a judgment on the best way to perform a task, then documented it. We are in the early stages of building software that will do more qualitative analysis such as this, but that's a much harder challenge. The main advantage of today's usage of software writing is to automate repetitive types of content. This is less applicable for books.

In the near term, the writers at O'Reilly and elsewhere have nothing to worry about. But I wouldn't count out automation in the long term.

Associated box score photo on home and category pages via Wikipedia.

February 03 2011

Four short links: 3 February 2011

  1. Curveship -- a new interactive fiction system that can tell the same story in many different ways. Check out the examples on the home page. Important because interactive fiction and the command-lines of our lives are inextricably intertwined.
  2. Egypt's Revolution: Coming to an Economy Near You (Umair Haque) -- more dystopic prediction, but this phrase rings true: The lesson: You can't steal the future forever — and, in a hyperconnected world, you probably can't steal as much of it for as long.
  3. Why Startups Fail -- failure is a more instructive teacher than success, so simply studying successful startups isn't enough. (via Hacker News)
  4. Computer Science and Philosophy -- Oxford is offering a program studying CS and Philosophy together. The two disciplines share a broad focus on the representation of information and rational inference, embracing common interests in algorithms, cognition, intelligence, language, models, proof, and verification. Computer Scientists need to be able to reflect critically and philosophically about these, as they push forward into novel domains. Philosophers need to understand them within a world increasingly shaped by computer technology, in which a whole new range of enquiry has opened up, from the philosophy of AI, artificial life and computation, to the ethics of privacy and intellectual property, to the epistemology of computer models (e.g. of global warming). I wish every CS student had taken a course in ethics.

September 09 2010

Four short links: 9 September 2010

  1. CloudUSB -- a USB key containing your operating environment and your data + a protected folder so nobody can access your data, even if you lost the key + a backup program which keeps a copy of your data on an online disk, with double password protection. (via ferrouswheel on Twitter)
  2. FCC APIs -- for spectrum licenses, consumer broadband tests, census block search, and more. (via rjweeks70 on Twitter)
  3. Sibyl: A system for large scale machine learning (PDF) -- paper from Google researchers on how to build machine learning on top of a system designed for batch processing. (via Greg Linden)
  4. The Surprisingness of What We Say About Ourselves (BERG London) -- I made a chart of word-by-word surprisingness: given the statement so far, could Scribe predict what would come next?

September 07 2010

Four short links: 7 September 2010

  1. GalaxyZoo for Climate Science? -- GalaxyZoo is a crowdsourced physics research project. A group of climate scientists want the same, to help predict "weather events". See also the Guardian article. (via adw_tweets on Twitter)
  2. Crispian's Science Map -- gorgeous Underground-style map showing scientists and their contributions. (via arjenlentz on Twitter)
  3. Programming Things I Wish I Knew Earlier (Ted Dziuba) -- opinionated piece, but boils down to "keep it simple until you can't", and "the more you know about the actual hardware, the better you can code". With EC2, when Amazon says "I/O performance: High", what does that even mean? Is that suitable for a heavy random read scenario? (via Hacker News)
  4. The Molecular Biology Carnival, 2ed -- collection of excellent blog writing about molecular biology. (via BioinfoTools on Twitter)

August 25 2010

Four short links: 25 August 2010

  1. Why Narrative and Structure are Important (Ed Yong) -- Ed looks at how Atul Gawande's piece on death and dying, which is 12,000 words long, is an easy and fascinating read despite the length.
  2. Understanding Science (Berkeley) -- simple teaching materials to help students understand the process of science. (via BoingBoing comments)
  3. Sax: Symbolic Aggregate approXimation -- SAX is the first symbolic representation for time series that allows for dimensionality reduction and indexing with a lower-bounding distance measure. In classic data mining tasks such as clustering, classification, index, etc., SAX is as good as well-known representations such as Discrete Wavelet Transform (DWT) and Discrete Fourier Transform (DFT), while requiring less storage space. In addition, the representation allows researchers to avail of the wealth of data structures and algorithms in bioinformatics or text mining, and also provides solutions to many challenges associated with current data mining tasks. One example is motif discovery, a problem which we recently defined for time series data. There is great potential for extending and applying the discrete representation on a wide class of data mining tasks. Source code has "non-commercial" license. (via rdamodharan on Delicious)
  4. Open Source OSCON (RedMonk) -- The business of selling open source software, remember, is dwarfed by the business of using open source software to produce and sell other services. And yet historically, most of the focus on open source software has accrued to those who sold it. Today, attention and traction is shifting to those who are not in the business of selling software, but rather share their assets via a variety of open source mechanisms. (via Simon Phipps)
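
For readers who want to see what the SAX entry above describes, here is a rough Python sketch of the transform: z-normalize the series, average it over equal-width segments (piecewise aggregate approximation), then map each segment mean to a letter using breakpoints that cut the standard normal curve into equal-probability bins. This is an illustrative reading of the technique, not the authors' source code. The function name `sax`, the use of population standard deviation, and the requirement that the series length divide evenly are simplifying assumptions, and the lower-bounding distance measure is not shown.

```python
from bisect import bisect_left
from statistics import NormalDist, fmean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Convert a numeric series into a SAX-style word (simplified sketch)."""
    if len(series) % n_segments:
        raise ValueError("this sketch assumes len(series) is divisible by n_segments")

    # 1. Z-normalize so the values have mean 0 and standard deviation 1.
    mean, sd = fmean(series), pstdev(series)
    z = [(x - mean) / sd if sd else 0.0 for x in series]

    # 2. Piecewise aggregate approximation: the mean of each equal-width segment.
    size = len(series) // n_segments
    paa = [fmean(z[i * size:(i + 1) * size]) for i in range(n_segments)]

    # 3. Breakpoints that split the standard normal into equal-probability bins.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]

    # 4. Replace each segment mean with the letter of the bin it falls in.
    return "".join(alphabet[bisect_left(breakpoints, v)] for v in paa)
```

With a four-letter alphabet, a steadily rising series of eight values split into four segments comes out as the word "abcd", which is the dimensionality reduction the entry mentions: eight numbers become four symbols.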

June 29 2010

Four short links: 29 June 2010

  1. The Diary of Samuel Pepys -- a remarkable mashup of historical information and literature in modern technology to make the Pepys diaries an experience rather than an object. It includes historical weather, glosses, maps, even an encyclopedia. (prompted by Jon Udell)
  2. The Tonido Plug Server -- one of many such wall-wart sized appliances. This caught my eye: CodeLathe, the folks behind Tonido, have developed a web interface and suite of applications. The larger goal is to get developers to build other applications for inclusion in Tonido’s own app store.
  3. Wikileaks Fails "Due Diligence" Review -- interesting criticism of Wikileaks from Federation of American Scientists. “Soon enough,” observed Raffi Khatchadourian in a long profile of WikiLeaks’ Julian Assange in The New Yorker (June 7), “Assange must confront the paradox of his creation: the thing that he seems to detest most-power without accountability-is encoded in the site’s DNA, and will only become more pronounced as WikiLeaks evolves into a real institution.” (via Hacker News)
  4. Yahoo Style Guide -- a paper book, but also a web site with lots of advice for those writing online.

April 16 2010

Four short links: 16 April 2010

  1. Buckets and Vessels (Aaron Straup Cope) -- amazing collection of projects and the cultural shifts they illustrate. Michal Migurski's Walking Papers, software designed to round-trip paper and digital edits to Open Street Map, has recently been used by professors at the University of California’s Berkeley’s School of Information to enable “a sort of psychogeographical dispute resolution between high school students in the town of Richmond marking up maps of their school and neighbourhood with tags like “stoners”, “asian gangsters” or “make-out spot” (http://groups.ischool.berkeley.edu/papermaps/kennedy.html). By allowing participants to manipulate the perception of their environment they are given a sort of bias knob to adjust the psychics and gravity of one space over another and to create a truly personal map of the world. (via auchmill on Twitter)
  2. Jonathan Ive on Industrial Design -- fascinating to hear him talk about how he approaches his products; the interplay between materials, manufacturing methods, and function.
  3. Hacking Toy EEGs (MindHacks) -- who doesn't want to do this, just based on the title alone?
  4. Mamet's Memo to the Writers -- forceful, clear, and commanding. A tremendous insight, in a short period of time, into what good writing is. No idea why it's in all caps. SOMEONE HAS TO MAKE THE SCENE DRAMATIC. IT IS NOT THE ACTORS JOB (THE ACTORS JOB IS TO BE TRUTHFUL). IT IS NOT THE DIRECTORS JOB. HIS OR HER JOB IS TO FILM IT STRAIGHTFORWARDLY AND REMIND THE ACTORS TO TALK FAST. IT IS YOUR JOB. (via Dan Meyer)

January 07 2010

Pew Research asks questions about the Internet in 2020

Pew Research, which seems to be interested in just about everything,
conducts a "future of the Internet" survey every few years in which
they throw outrageously open-ended and provocative questions at a
chosen collection of observers in the areas of technology and
society. Pew makes participation fun by finding questions so pointed
that they make you choke a bit. You start by wondering, "Could I
actually answer that?" and then think, "Hey, the whole concept is so
absurd that I could say anything without repercussions!" So I
participated in their 2006 survey
(http://www.pewinternet.org/Reports/2006/The-Future-of-the-Internet-II.aspx)
and did it again this week. The Pew report will
aggregate the yes/no responses from the people they asked to
participate, but I took the exercise as a chance to hammer home my own
choices of issues.

(If you'd like to take the survey, you can currently visit

http://www.facebook.com/l/c6596;survey.confirmit.com/wix2/p1075078513.aspx

and enter PIN 2000.)

Will Google make us stupid?

This first question is not about a technical or policy issue on the
Internet or even how people use the Internet, but a purported risk to
human intelligence and methods of inquiry. Usually, questions about
how technology affects our learning or practice really concern our
values and how we choose technologies, not the technology itself. And
that's the basis on which I address such questions. I am not saying
technology is neutral, but that it is created, adopted, and developed
over time in a dialog with people's desires.

I respect the questions posed by Nicholas Carr in his Atlantic
article--although it's hard to take such worries seriously when he
suggests that even the typewriter could impoverish writing--and would
like to allay his concerns. The question is all about people's
choices. If we value introspection as a road to insight, if we
believe that long experience with issues contributes to good judgment
on those issues, if we (in short) want knowledge that search engines
don't give us, we'll maintain our depth of thinking and Google will
only enhance it.

There is a trend, of course, toward instant analysis and knee-jerk
responses to events that degrades a lot of writing and discussion. We
can't blame search engines for that. The urge to scoop our contacts
intersects with the starvation of funds for investigative journalism
to reduce the value of the reports we receive about things that are
important for us. Google is not responsible for that either (unless
you blame it for draining advertising revenue from newspapers and
magazines, which I don't). In any case, social and business trends
like these are the immediate influences on our ability to process
information, and searching has nothing to do with them.

What search engines do is provide more information, which we can use
either to become dilettantes (Carr's worry) or to bolster our
knowledge around the edges and do fact-checking while we rely mostly
on information we've gained in more robust ways for our core analyses.
Google frees the time we used to spend pulling together the last 10%
of facts we need to complete our research. I read Carr's article when
The Atlantic first published it, but I used a web search to pull it
back up and review it before writing this response. Google is my
friend.

Will we live in the cloud or the desktop?

Our computer usage will certainly move more and more to an environment
of small devices (probably in our hands rather than on our desks)
communicating with large data sets and applications in the cloud.
This dual trend, bifurcating our computer resources between the tiny
and the truly gargantuan, has many consequences that other people
have explored in depth: privacy concerns, the risk that application
providers will gather enough data to preclude competition, the
consequent slowdown in innovation that could result, questions about
data quality, worries about services becoming unavailable (like
Twitter's fail whale, which I saw as recently as this morning), and
more.

One worry I have is that netbooks, tablets, and cell phones will
become so dominant that meaty desktop systems will rise in cost
till they are within the reach only of institutions and professionals.
That will discourage innovation by the wider populace and reduce us to
software consumers. Innovation has benefited a great deal from the
ability of ordinary computer users to bulk up their computers with a
lot of software and interact with it at high speeds using high quality
keyboards and large monitors. That kind of grassroots innovation may
go away along with the systems that provide those generous resources.

So I suggest that cloud application providers recognize the value of
grassroots innovation--following Eric von Hippel's findings--and
solicit changes in their services from their visitors. Make their code
open source--but even more than that, set up test environments where
visitors can hack on the code without having to download much
software. Then anyone with a comfortable keyboard can become part of
the development team.

We'll know that software services are on a firm foundation for future
success when each one offers a "Develop and share your plugin here"
link.

Will social relations get better?

Like the question about Google, this one is more about our choices
than our technology. I don't worry about people losing touch with
friends and family. I think we'll continue to honor the human needs
that have been hard-wired into us over the millions of years of
evolution. I do think technologies ranging from email to social
networks can help us make new friends and collaborate over long
distances.

I do worry, though, that social norms aren't keeping up with
technology. For instance, it's hard to turn down a "friend" request
on a social network, particularly from someone you know, and even
harder to "unfriend" someone. We've got to learn that these things are
OK to do. And we have to be able to partition our groups of contacts
as we do in real life (work, church, etc.). More sophisticated social
networks will probably evolve to reflect our real relationships more
closely, but people have to take the lead and refuse to let technical
options determine how they conduct their relationships.

Will the state of reading and writing be improved?

Our idea of writing changes over time. The Middle Ages left us lots of
horribly written documents. The few people who learned to read and
write often learned their Latin (or other language for writing) rather
minimally. It took a long time for academies to impose canonical
rules for rhetoric on the population. I doubt that a cover letter and
resume from Shakespeare would meet the writing standards of a human
resources department; he lived in an age before standardization and
followed his ear more than rules.

So I can't talk about "improving" reading and writing without
addressing the question of norms. I'll write a bit about formalities
and then about the more important question of whether we'll be able to
communicate with each other (and enjoy what we read).

In many cultures, writing and speech have diverged so greatly that
they're almost separate languages. And English in Jamaica is very
different from English in the US, although I imagine Jamaicans try
hard to speak and write in US style when they're communicating with
us. In other words, people do recognize norms, but usage depends on
the context.

Increasingly, nowadays, the context for writing is a very short form
utterance, with constant interaction. I worry that people will lose
the ability to state a thesis in unambiguous terms and a clear logical
progression. But because they'll be in instantaneous contact with
their audience, they can restate their ideas as needed until
ambiguities are cleared up and their reasoning is unveiled. And
they'll be learning from others along the way. Making an elegant and
persuasive initial statement won't be so important because that
statement will be only the first step of many.

Let's admit that dialog is emerging as our generation's way to develop
and share knowledge. The notion driving Ibsen's Hedda Gabler--that an
independent philosopher such as Ejlert Løvborg could write a
masterpiece that would in itself change the world--is passé. A
modern Løvborg would release his insights in a series of blogs
to which others would make thoughtful replies. If this eviscerated
Løvborg's originality and prevented him from reaching the
heights of inspiration--well, that would be Løvborg's fault for
giving in to pressure from more conventional thinkers.

If the Romantic ideal of the solitary genius is fading, what model for
information exchange do we have? Check Plato's Symposium. Thinkers
were expected to engage with each other (and to have fun while doing
so). Socrates denigrated reading, because one could not interrogate
the author. To him, dialog was more fertile and more conducive to
truth.

The ancient Jewish scholars also preferred debate to reading. They
certainly had some received texts, but the vast majority of their
teachings were generated through conversation and were not written
down at all until the scholars realized they had to in order to avoid
losing them.

So as far as formal writing goes, I do believe we'll lose the subtle
inflections and wordplay that come from a widespread knowledge of
formal rules. I don't know how many people nowadays can appreciate all
the ways Dickens sculpted language, for instance, but I think there
will be fewer in the future than there were when Dickens rolled out
his novels.

But let's not get stuck on the aesthetics of any one period. Dickens
drew on a writing style that was popular in his day. In the next
century, Toni Morrison, John Updike, and Vladimir Nabokov wrote in a
much less formal manner, but each is considered a beautiful stylist in
his or her own way. Human inventiveness is infinite and language is a
core skill in which we all take pleasure, so we'll find new ways to
play with language that are appropriate to our age.

I believe there will always remain standards for grammar and
expression that will prove valuable in certain contexts, and people
who take the trouble to learn and practice those standards. As an
editor, I encounter lots of authors with wonderful insights and
delightful turns of phrase, but with deficits in vocabulary, grammar,
and other skills and resources that would enable them to write better.
I work with these authors to bring them up to industry-recognized
standards.

Will those in GenY share as much information about themselves as they age?

I really can't offer anything but baseless speculation in answer to
this question, but my guess is that people will continue to share as
much as they do now. After all, once they've put so much about
themselves up on their sites, what good would it do to stop? In for a
penny, in for a pound.

Social norms will evolve to accept more candor. After all, Ronald
Reagan got elected President despite having gone through a divorce,
and Bill Clinton got elected despite having smoked marijuana.
Society's expectations evolve.

Will our relationship to key institutions change?

I'm sure the survey designers picked this question knowing that its
breadth makes it hard to answer, but in consequence it's something of
a joy to explore.

The widespread sharing of information and ideas will definitely change
the relative power relationships of institutions and the masses, but
they could move in two very different directions.

In one scenario offered by many commentators, the ease of
whistleblowing and of promulgating news about institutions will
combine with the ability of individuals to associate over social
networking to create movements for change that hold institutions more
accountable and make them more responsive to the public.

In the other scenario, large institutions exploit high-speed
communications and large data stores to enforce even greater
centralized control, and use surveillance to crush opposition.

I don't know which way things will go. Experts continually urge
governments and businesses to open up and accept public input, and
those institutions resist doing so despite all the benefits. So I have
to admit that in this area I tend toward pessimism.

Will online anonymity still be prevalent?

Yes, I believe people have many reasons to participate in groups and
look for information without revealing who they are. Luckily, most new
systems (such as U.S. government forums) are evolving in ways that
build in privacy and anonymity. Businesses are more eager to attach
our online behavior to our identities for marketing purposes, but
perhaps we can find a compromise where someone can maintain a
pseudonym associated with marketing information but not have it
attached to his or her person.

Unfortunately, most people don't appreciate the dangers of being
identified. But those who do can take steps to be anonymous or
pseudonymous. As for state repression, there is something of an
escalating war between individuals doing illegal things and
institutions that want to uncover those individuals. So far, anonymity
seems to be holding on, thanks to a lot of effort by those who care.

Will the Semantic Web have an impact?

As organizations and news sites put more and more information online,
they're learning the value of organizing and cross-linking
information. I think the Semantic Web is taking off in a small way on
site after site: a better breakdown of terms on one medical site, a
taxonomy on a Drupal-powered blog, etc.

But Berners-Lee had a much grander vision of the Semantic Web than
better information retrieval on individual sites. He's gunning for
content providers and Web designers the world around to pull together
and provide easy navigation from one site to another, despite wide
differences in their contributors, topics, styles, and viewpoints.

This may happen someday, just as artificial intelligence is looking
more feasible than it was ten years ago, but the chasm between the
present and the future is enormous. To make the big vision work, we'll
all have to use the same (or overlapping) ontologies, with standards
for extending and varying the ontologies. We'll need to disambiguate
things like webbed feet from the World Wide Web. I'm sure tools to
help us do this will get smarter, but they need to get a whole lot
smarter.

Even with tools and protocols in place, it will be hard to get
billions of web sites to join the project. Here the cloud may be of
help. If Google can perform the statistical analysis and create the
relevant links, I don't have to do it on my own site. But I bet
results would be much better if I had input.

Are the next takeoff technologies evident now?

Yes, I don't believe there's much doubt about the technologies that
companies will commercialize and make widespread over the next five
years. Many people have listed these technologies: more powerful
mobile devices, ever-cheaper netbooks, virtualization and cloud
computing, reputation systems for social networking and group
collaboration, sensors and other small systems reporting limited
amounts of information, do-it-yourself embedded systems, robots,
sophisticated algorithms for slurping up data and performing
statistical analysis, visualization tools to report the results of
that analysis, affective technologies, personalized and location-aware
services, excellent facial and voice recognition, electronic paper,
anomaly-based security monitoring, self-healing systems--that's a
reasonable list to get started with.

Beyond five years, everything is wide open. One thing I'd like to see
is a really good visual programming language, or something along those
lines that is more closely matched to human strengths than our current
languages. An easy high-level programming language would immensely
increase productivity, reduce errors (and security flaws), and bring
in more people to create a better Internet.

Will the internet still be dominated by the end-to-end principle?

I'll pick up here on the paragraph in my answer about takeoff
technologies. The end-to-end principle is central to the Internet. I
think everybody would like to change some things about the current
essential Internet protocols, but they don't agree on what those things
should be. So I have no expectation of a top-to-bottom redesign of the
Internet at any point in our viewfinder. Furthermore, the inertia
created by millions of systems running current protocols would be hard
to overcome. So the end-to-end principle is enshrined for the
foreseeable future.

Mobile firms and ISPs may put up barriers, but anyone in an area of
modern technology who tries to shut the spigot on outside
contributions eventually becomes last year's big splash. So unless
there's a coordinated assault by central institutions like
governments, the inertia of current systems will combine with the
momentum of innovation and public demand for new services to keep
chokepoints from being serious problems.
