Blog
IBM's Watson Can Save Newspaper Headlines
IBM's Watson understands puns and wordplay. Why can't search engines do the same?
I visited the Google headquarters at Mountain View a few years ago and met a man whose job title, at least informally, was "search evangelist." (Given the Peter Pan-like argot of Silicon Valley firms, I wouldn't be surprised if this was his official title.) We were talking about all the things people do to make their sites more attractive to search engines, which made him very agitated. The responsibility of ranking a site correctly, he said, lies almost completely with the search engine, not the webmaster. If a good site is not ranked as highly as it should be, he said, it's the search engine's fault, not the developer's for failing to construct her content and HTML correctly.
We know this practice as "search engine optimization," of course, and companies pay tremendous sums for people to tell them how to make their sites as search-friendly as possible. Some of these consultants are very knowledgeable, some are voodoo artists, and all of them are fundamentally trying to game a black box.
This practice has been particularly devastating to journalism, because it has taken all of the fun out of writing headlines at Web publications. Titling a piece used to be the height of wordplay. Now it's a process of jamming as many keywords into the headline as possible so that people searching for any of the words associated with the story might chance upon the article. At Slate, where I worked for about four years, they write three or four headlines--a clever one, a searchable one, and one that actually describes what the article is about (in theory). I don't know how well that works from a technical standpoint.
When IBM debuted Watson, its triumph of artificial intelligence that could beat the best humans at Jeopardy!, I wrote an article about the practical implications of the technology. One of those advances was the ability for a computer to recognize puns and other sorts of wordplay, since so many of the game show's clues involve little linguistic jokes. It occurs to me that search engines, should they be able to replicate this ability, could apply the same parsing of language to the titles of Web pages. A story about hip hop's ballooning popularity in nursing homes, for example, would not need to fear losing its placement in the rankings by calling itself "One Hundred Years of a Holla-tude."
In an ideal world, perhaps search engines could even reward a particularly good headline--one that sings out to the reader, not the computer. Call it search engine pop-tizimation.
Posted by chris on April 10 2012
We Need a Wikipedia for Music Transcriptions
Would it be possible to use the MediaWiki framework for a collaborative music transcription project?
Transcribing the improvised solos of the greats is a core part of jazz education, but it's rare that people share their (often hand-written) notes on, say, exactly what Clifford was doing in that second chorus of "Step Lightly."
Part of the point of transcribing solos is the fun and frustration of working through a recording yourself. But it would be nice to compare notes now and again. I just posted a PDF of an attempt to transcribe the solos in one of my favorite Modern Jazz Quartet pieces. But I'm sure it's rife with errors, and there's no elegant way for people to point them out on these flat images.
This makes me wonder if it would be possible to use the MediaWiki framework for a collaborative music transcription project. There appear to already be some impressive tools for rendering musical notation in JavaScript. I don't know exactly how the process of editing an existing transcription would work. Ideally, the "edit" tab would render the notation in a manipulable form common to most notation software, with instant playback and so forth. (I recommend SoundManager 2 for JS audio.) Like any wiki model, of course, all previous versions of a particular tune would be archived. A pure JavaScript rendering of interactive notation would be a major undertaking, but I see no reason it isn't technically possible.
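To make the wiki idea a little more concrete, here is a minimal sketch in TypeScript of how revisions of a transcription might be archived. Everything in it is an assumption for illustration: the `Note`, `Revision`, and `TranscriptionPage` names are hypothetical and are not part of MediaWiki or any notation library, and a real tool would need a far richer model of notation than pitch and duration.

```typescript
// Hypothetical data model for illustration only.
interface Note {
  pitch: string;    // e.g. "Bb4"
  duration: number; // length in beats
}

interface Revision {
  id: number;
  author: string;
  comment: string;
  notes: readonly Note[];
}

class TranscriptionPage {
  readonly title: string;
  private revisions: Revision[] = [];

  constructor(title: string) {
    this.title = title;
  }

  // Save a new revision; earlier revisions are never modified.
  edit(author: string, comment: string, notes: Note[]): Revision {
    const revision: Revision = {
      id: this.revisions.length + 1,
      author,
      comment,
      // Copy each note so later changes by the caller cannot alter history.
      notes: notes.map((n) => ({ ...n })),
    };
    this.revisions.push(revision);
    return revision;
  }

  // The most recent revision, or undefined if nobody has edited yet.
  current(): Revision | undefined {
    return this.revisions[this.revisions.length - 1];
  }

  history(): readonly Revision[] {
    return this.revisions;
  }

  // Positions (note indexes) at which two revisions disagree.
  diff(fromId: number, toId: number): number[] {
    const a = this.revisions[fromId - 1];
    const b = this.revisions[toId - 1];
    if (!a || !b) throw new Error("unknown revision");
    const length = Math.max(a.notes.length, b.notes.length);
    const changed: number[] = [];
    for (let i = 0; i < length; i++) {
      const x = a.notes[i];
      const y = b.notes[i];
      if (!x || !y || x.pitch !== y.pitch || x.duration !== y.duration) {
        changed.push(i);
      }
    }
    return changed;
  }
}

// Usage: a reader corrects the second note of a first pass.
const page = new TranscriptionPage("Example Tune");
page.edit("chris", "first pass", [
  { pitch: "F4", duration: 1 },
  { pitch: "A4", duration: 1 },
]);
page.edit("reader", "second note is flat", [
  { pitch: "F4", duration: 1 },
  { pitch: "Ab4", duration: 1 },
]);
console.log(page.diff(1, 2)); // [ 1 ]
```

Comparing notes position by position is the crudest possible diff. It would mark everything after an inserted note as changed, so a real project would probably want something closer to a text diff over measures.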
Posted by chris on December 8 2011
Flash and its Discontents
Until about a year ago, I coded most of my projects in Flash and its cousin Flex, which is developed entirely in code without the benefit of Flash's visual interface. (It's sort of like the difference between algebra and geometry, though since Flash moved to ActionScript 3, it resembles Flex a lot more.) Both platforms can draw very quickly and create enormously complex animations and interactives, though they can often eat up a lot of system resources in the process.
The general move away from Flash seems to have begun with the introduction of the first iPhone, which did not support it. While I admire the late Steve Jobs, I still resent that he restricted development for his iProducts to a closed SDK, essentially doubling the work for people who want their applications to run smoothly on the Web and Apple products (not to mention other mobile platforms).
Purely for the sake of efficiency, we started using the JavaScript library Raphael for most visualizations. Raphael is based on the drawing platform SVG and is also compatible with Microsoft's equivalent, VML, so it made life easier to do things once. I can't say enough about how useful that library has been to my data visualization projects. It has the added benefit of running natively in the browser, so it doesn't matter whether the user has updated his or her Flash player recently (or ever).
I don't think this is a permanent solution, however. Raphael cannot draw as quickly as Flash, making projects like county-by-county maps slow to the point of being unusable. The obvious successor is HTML5, whose canvas feature makes for much more elegant drawing inside the browser. Unfortunately, the 30 percent or so of users who still use Internet Explorer 8 or below cannot view HTML5 content, and most of us can't afford to lose that audience. (IE9 can render many parts of HTML5, but requires an upgrade to Vista or Windows 7.)
For intensive visualizations, I fear the short-term solution will be to code interactives twice: once in HTML5 for modern browsers and once in Flash as a fallback for the IE crowd. A very clever developer might be able to write a meta language that could port to either platform with minimal recoding. This would be a major asset to anyone interested in reaching the widest possible audience with powerful applications.
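As a rough illustration of the write-once idea, the TypeScript sketch below describes a drawing as plain data and hands it to two interchangeable back ends. It is only a sketch under stated assumptions: the `Shape` type and both functions are hypothetical names, the second target here is an SVG string rather than Flash, and `Canvas2DLike` simply mirrors a handful of member names from the standard Canvas 2D context.

```typescript
// A scene is plain data, so it carries no dependency on any one drawing platform.
type Shape =
  | { kind: "rect"; x: number; y: number; width: number; height: number; fill: string }
  | { kind: "circle"; cx: number; cy: number; r: number; fill: string };

// Assumed subset of the Canvas 2D context used by this sketch.
// fillStyle is typed loosely because the real property also accepts gradients and patterns.
interface Canvas2DLike {
  fillStyle: unknown;
  beginPath(): void;
  arc(x: number, y: number, radius: number, startAngle: number, endAngle: number): void;
  fill(): void;
  fillRect(x: number, y: number, w: number, h: number): void;
}

// Back end 1: issue drawing calls against a canvas-style context.
function drawToCanvas(scene: Shape[], ctx: Canvas2DLike): void {
  for (const s of scene) {
    ctx.fillStyle = s.fill;
    if (s.kind === "rect") {
      ctx.fillRect(s.x, s.y, s.width, s.height);
    } else {
      ctx.beginPath();
      ctx.arc(s.cx, s.cy, s.r, 0, 2 * Math.PI); // full circle
      ctx.fill();
    }
  }
}

// Back end 2: serialize the same scene as SVG markup.
function drawToSvg(scene: Shape[], width: number, height: number): string {
  const body = scene.map((s) =>
    s.kind === "rect"
      ? `<rect x="${s.x}" y="${s.y}" width="${s.width}" height="${s.height}" fill="${s.fill}"/>`
      : `<circle cx="${s.cx}" cy="${s.cy}" r="${s.r}" fill="${s.fill}"/>`
  );
  return `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">${body.join("")}</svg>`;
}

// Usage: one scene description, rendered here through the SVG back end.
const scene: Shape[] = [
  { kind: "rect", x: 0, y: 0, width: 10, height: 5, fill: "#ccc" },
  { kind: "circle", cx: 5, cy: 5, r: 2, fill: "red" },
];
console.log(drawToSvg(scene, 20, 10));
```

In a browser, the choice of back end could hang on a feature test such as checking whether `document.createElement("canvas").getContext` exists. The hard part the sketch skips is everything beyond static shapes: animation, events, and text are where the platforms really diverge.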
Posted by chris on November 20 2011
What's Hypermedia?
You may notice that the tagline on this site refers to "hypermedia." The term was coined alongside "hypertext" in 1963 by a man named Ted Nelson, but only one would stick. "Hypertext"--literally "beyond text"--is etched in the 'H' and 'T' of HTML, but you don't often hear the second term. Nelson commented on this in his book Literary Machines.
"By now the word 'hypertext' has become generally accepted for branching and responding text, but the corresponding word 'hypermedia', meaning complexes of branching and responding graphics, movies and sound – as well as text – is much less used. Instead they use the strange term 'interactive multimedia': this is four syllables longer, and does not express the idea of extending hypertext."
Aside from syllable conservation, I like the term because it broadly encompasses the goal of Web-based graphics and interactives, which is to endow them with more information than is possible in print. Right now, this might mean a map that animates through time, changing color as data points change and allowing the user to mouse over counties to get an individual report for that locality. In the future, I look forward to "hypermovies" that let you mouse over or somehow point to an actor and get a list of what other movies you've seen him or her in.
In a way, it's a lazy phrase that serves as a catchall for whatever people are producing online. But as a concept, "beyond media" sounds as appealing as Web development gets.
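For the curious, the color-changing part of such a map comes down to a small calculation per county per frame. The TypeScript sketch below is one way it might look, with the caveat that the function names, the data shape, and the sample numbers are all made up for illustration. It blends linearly between two colors, which is an assumption; real maps often use bucketed or perceptually tuned scales instead.

```typescript
type Rgb = [number, number, number];

// Blend linearly from `low` to `high`; t is clamped to the range 0..1.
function lerpColor(low: Rgb, high: Rgb, t: number): string {
  const clamped = Math.min(1, Math.max(0, t));
  const mix = (a: number, b: number) => Math.round(a + (b - a) * clamped);
  return `rgb(${mix(low[0], high[0])}, ${mix(low[1], high[1])}, ${mix(low[2], high[2])})`;
}

// Hypothetical data shape: county id -> one value per time step.
type Series = Record<string, number[]>;

// Compute the fill color of every county for a single frame of the animation.
function frameColors(
  data: Series,
  step: number,
  min: number,
  max: number,
  low: Rgb,
  high: Rgb
): Record<string, string> {
  const colors: Record<string, string> = {};
  for (const [county, values] of Object.entries(data)) {
    const value = values[step];
    if (value === undefined) continue; // no reading for this county at this step
    const t = max === min ? 0 : (value - min) / (max - min);
    colors[county] = lerpColor(low, high, t);
  }
  return colors;
}

// Usage with invented numbers: two counties, two time steps, values from 0 to 10.
const sample: Series = { countyA: [0, 10], countyB: [5, 5] };
console.log(frameColors(sample, 1, 0, 10, [0, 0, 0], [200, 100, 0]));
```

Stepping `step` forward on a timer and repainting with the returned colors gives the animation; the mouse-over report is then just a lookup into the same data by county id.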
Posted by chris on November 20 2011