December 22 2011

Four short links: 22 December 2011

  1. Fuzzy String Matching in Python (Streamhacker) -- useful if you're to have a hope against the swelling dark forces powered by illiteracy and touchscreen keyboards.
  2. The Business of Illegal Data (Strata Conference) -- fascinating presentation on criminal use of big data. "The more data you produce, the happier criminals are to receive and use it. Big data is big business for organized crime, which represents 15% of GDP."
  3. Isarithmic Maps -- an alternative to choropleths for geodata visualization.
  4. Server-Side Javascript Injection (PDF) -- a Blackhat talk about exploiting backend vulnerabilities with techniques learned from attacking JavaScript frontends. "Both this paper and the accompanying talk will discuss security vulnerabilities that can arise when software developers create applications or modules for use with JavaScript-based server applications such as NoSQL database engines or Node.js web servers. In the worst-case scenario, an attacker can exploit these vulnerabilities to upload and execute arbitrary binary files on the server machine, effectively granting him full control over the server."
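
As a rough illustration of the fuzzy string matching mentioned in item 1, here is a minimal sketch using only Python's standard `difflib` module. It is not necessarily the approach taken in the linked Streamhacker post; the helper names `similarity` and `best_match`, the sample vocabulary, and the 0.6 cutoff are all assumptions made for this example.

```python
from difflib import SequenceMatcher, get_close_matches


def similarity(a, b):
    """Return a similarity ratio in [0, 1], ignoring case (hypothetical helper)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(word, vocabulary, cutoff=0.6):
    """Return the closest vocabulary entry to `word`, or None if nothing
    scores at or above `cutoff` (hypothetical helper; cutoff is an assumption)."""
    matches = get_close_matches(word.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Illustrative vocabulary only; a real application would load its own word list.
vocabulary = ["python", "keyboard", "touchscreen", "illiteracy"]

if __name__ == "__main__":
    print(best_match("pyhton", vocabulary))
    print(round(similarity("Keyboard", "keybaord"), 2))
```

A ratio-based matcher like this is forgiving of transposed letters, which suits typos from touchscreen keyboards, but the cutoff would need tuning against real data.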

August 17 2010

Four short links: 17 August 2010

  1. Demo of Stemming Algorithms -- type in text and see what it looks like when stemmed with different algorithms provided by NLTK. (via zelandiya on Twitter)
  2. Crowdmap -- hosted Ushahidi. (via dvansickle on Twitter)
  3. Opinions vs Data -- talks about the usability of a new Gmail UI element, but notable for this quote from Jakob Nielsen: "In my two examples, the probability of making the right design decision was vastly improved when given the tiniest amount of empirical data." (via mcannonbrookes on Twitter)
  4. The Next Silicon Valley -- long and detailed list of the many forces contributing to Silicon Valley's success as a tech hub, arguing that the valley's position is path-dependent and can't simply be grown ab initio in some aspiring nation's co-prosperity zone of policy whim. (via imran and timoreilly on Twitter)

June 17 2010

Four short links: 17 June 2010

  1. What is IBM's Watson? (NY Times) -- IBM joining the big data machine learning race, and hatching a Blue Gene system that can answer Jeopardy questions. Does good, not great, and is getting better.
  2. Google Lays Out its Mobile Strategy (InformationWeek) -- notable to me for this passage: Rechis said that Google breaks down mobile users into three behavior groups: A. "Repetitive now" B. "Bored now" C. "Urgent now". A useful way to look at it. (via Tim)
  3. BP GIS and the Mysteriously Vanishing Letter -- intrigue in the geodata world. This post makes it sound as though cleanup data is going into a box behind BP's firewall, and the folks who said "um, the government should be the depot, because it needs to know it has a guaranteed-untampered and guaranteed-able-to-access copy of this data" were fired. For more info, including on the data that is available, see the geowanking thread.
  4. Streamhacker -- a blog talking about text mining and other good things, with NLTK code you can run. (via heraldxchaos on Delicious)

