Pannu's Pontifications
November 29
Does Personalization work for search?
Raul Valdes-Perez of Vivisimo says that personalization of search (i.e. figuring out what I want from a search that is different from what a random bozo wants) is not a fruitful avenue for exploitation (academic-speak for "can't get no satisfaction"). He wrote a one-page paper about it - here's (PDF) how it goes!
Raul was at Carnegie Mellon University when I was there - he was working on how scientific discoveries are made, or something like that, and wrote some AI programs that tried to simulate the process of scientific discovery. His other claim to fame was that his adviser was Herb Simon, CMU's resident Nobel laureate, who introduced the concept of bounded rationality in his Political Science dissertation at the University of Chicago. Herb was one of the founding faculty of the Graduate School of Industrial Administration. CMU was so quant - they couldn't bring themselves to call their business school the Business School!
Herb was one half of the Simon-Newell duo, who are considered "Fathers of AI", though there are enough claims about being the father(s) of AI that I am afraid to be seen with the mother! Herb and Allen Newell built GPS (General Problem Solver) - the first "means-ends", or goal-driven, forward and backward reasoning solver. I was on one of Allen Newell's projects when I was a graduate student researcher.
As with many AI folks, Raul ended up in search; Vivisimo is a CMU spin-off. One of my good buddies, Liren Chen, who is among the best natural language and search programmers I know, was with them before he moved to Google.
Raul basically says (i) interests change, (ii) profiles of users are not reliable and (iii) data gathered is unreliable as an indicator of preferences.
I disagree that personalization is a dead end - today we have only the two-word queries (I think the average search is of the order of 2 words) to infer what a user wants. From aggregating a lot of those two-word queries we can even make decent guesses - but having extra information that drives inference is not always bad.
Techniques exist to address each one of Raul's objections. I think Raul is reacting to the hype that search results based JUST on our click stream would be better than those we have now.
Where personalization scores is in segmentation - identifying you as being different from other users or the same as other users. The segmentation biases the aggregate search; it doesn't act as the basis for the search itself. To take one of Raul's examples - if you are identified as being in a "doctor" segment, typing anthrax means that you are most likely looking for the anthrax disease entries that the aggregate search is aware of. If you are identified as being in a "rock fan" segment, anthrax means nothing else but the rocking heavy metal group. The words you use as a doctor (in other activities you do) correlate with the words that will be in the documents describing anthrax the disease (for example "golf game", "nurses", "&** insurance companies"), and these can be used to make sure that the documents get biased by the query AND by the words in your profile. Whereas the words you use in other activities as a rocker are probably "sex", "drugs", "rock" and "roll"!
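To make the biasing idea concrete, here is a rough Python sketch. Everything in it is my own illustration, not how any real engine does it: the function names, the sample documents, the profile word sets and the `bias` weight are all made up. The aggregate search supplies a base score, and the profile only nudges that score up when a document shares words with the profile.

```python
def tokenize(text):
    """Lowercase, split on whitespace and strip simple punctuation."""
    words = (w.strip('.,!?"()').lower() for w in text.split())
    return [w for w in words if w]


def profile_overlap(profile_terms, doc_text):
    """Fraction of the document's distinct words that also occur in the profile."""
    doc_terms = set(tokenize(doc_text))
    if not doc_terms:
        return 0.0
    return len(doc_terms & profile_terms) / len(doc_terms)


def rerank(results, profile_terms, bias=0.5):
    """Re-order (doc_text, base_score) pairs from the aggregate search.

    The base score still drives the ranking; the profile only scales it
    by (1 + bias * overlap), so an empty profile changes nothing.
    `bias` is a hypothetical tuning knob.
    """
    scored = [
        (doc, base * (1.0 + bias * profile_overlap(profile_terms, doc)))
        for doc, base in results
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical aggregate results for the query "anthrax", tied on base score.
results = [
    ("Anthrax thrash metal band tour dates and album", 1.0),
    ("Anthrax infection symptoms treatment for patients", 1.0),
]

# Hypothetical segment profiles built from words used in other activities.
doctor = {"patients", "treatment", "nurses", "infection"}
rocker = {"metal", "band", "album", "tour"}

top_for_doctor = rerank(results, doctor)[0][0]
top_for_rocker = rerank(results, rocker)[0][0]
```

With these toy inputs the doctor profile should pull the disease page to the top and the rocker profile the band page, while the query and the aggregate scores stay exactly the same.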
The challenge is figuring out which words you use consistently and which activities they are coupled with - you may be part of both the "doctor" segment and the "rock star" segment. Another challenge is mapping the words that describe a particular segment to that segment. Yet another is diagnosing whether you are doing a one-off activity, like searching for a new newspaper story, or acting on an unchanging preference. Amazon already does a great job (in a constrained domain) of mapping the words you use, the books you buy and the meta-data associated with a book to bias the search for those books. I disagree that buying the books & spending time reading them makes this data more valuable than the data gathered from web page visits. I am committed enough to some websites that I am sure statistical inference can pick it up!
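One crude way to separate the words you use consistently from one-off activity is to count how many separate sessions each word shows up in. The sketch below is only an assumption of mine about how that might look: the `stable_terms` name, the session data and the 50% threshold are all invented for illustration, and a real system would need something statistically sturdier.

```python
from collections import Counter


def stable_terms(sessions, min_fraction=0.5):
    """Return the words that occur in at least min_fraction of the sessions.

    `sessions` is a list of word lists, one per browsing or search session.
    Each word is counted once per session, so a burst of one-off activity
    inside a single session cannot dominate the profile.
    """
    if not sessions:
        return set()
    counts = Counter()
    for session in sessions:
        counts.update(set(session))
    threshold = min_fraction * len(sessions)
    return {term for term, seen in counts.items() if seen >= threshold}


# Hypothetical sessions: "election" and "guitar" are one-off interests.
sessions = [
    ["golf", "nurses", "insurance"],
    ["nurses", "insurance", "election", "election", "election"],
    ["insurance", "nurses"],
    ["guitar", "nurses"],
]

profile = stable_terms(sessions)
```

Here only "nurses" and "insurance" survive, because the other words each appear in a single session, however often they are repeated inside it.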
All this may be hard but I believe it is solvable, and I would go as far as to say, solvable using the same techniques used for making search more relevant.