Thursday, March 13, 2008

Welcome

Welcome all! Let me introduce myself. My name is Len Bertoni, and I am a member of team acmehill, consisting of myself, my wife, my daughter, and my son. We are competing for the Netflix Prize (http://www.netflixprize.com/) and have recently cracked the top 10.

Team acmehill downloaded the Netflix Prize data about a year ago, but only recently "got serious" about attempting to do anything with it, starting our leaderboard climb in Nov 2007. Our computing power is rather modest, consisting of a two-year-old eMachines desktop (3GHz, 2GB RAM, 160GB HD) purchased from Wal-Mart. However, since the contest is as much about thinking through why and how people rate movies as it is about applying proven (and not so proven!) mathematical techniques to the provided data, hardware has not limited us too much. All programs are written in VB.Net using Visual Basic Express 2005. All code is written "from scratch" and no external mathematical libraries are used.

My family provides me with new ideas to test out while I subject them to countless "fun hours" of delving into matrix factorization (Dad, what is a matrix anyway?), neural nets (does anyone really know how they work anyway?), etc.

I have tried the following techniques with varying success (many do not survive in my final blend, as I cannot bear to blend hundreds of files to produce my final prediction set):

KNN on Movies,
Matrix Factorization,
User/Movie Clustering,
Neural Nets,
Restricted Boltzmann Machines,
Anchoring,
Milestones,
Global Effects Removal

Right now I am implementing KNN on users, which is quite a bit more computationally intensive, as the sheer number of users (half a million) makes nearest-neighbor determination more problematic. Hopefully team acmehill will have some success with this approach.
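For the curious, here is a minimal sketch of the user-user KNN idea, in Python rather than the VB.Net I actually use, with a tiny toy `ratings` dictionary standing in for the real data. It is an illustration of the general technique, not my actual code, and a naive loop like this scans every user for every prediction, which is exactly why half a million users is a problem:

```python
import math

# ratings[user][movie] = stars; a toy stand-in for the Netflix training set
ratings = {
    "u1": {"m1": 5, "m2": 3, "m3": 4},
    "u2": {"m1": 4, "m2": 2, "m3": 5, "m4": 1},
    "u3": {"m2": 5, "m3": 2, "m4": 4},
}

def pearson(a, b):
    """Pearson correlation over the movies both users rated."""
    common = set(a) & set(b)
    if len(common) < 2:
        return 0.0
    ma = sum(a[m] for m in common) / len(common)
    mb = sum(b[m] for m in common) / len(common)
    num = sum((a[m] - ma) * (b[m] - mb) for m in common)
    den = math.sqrt(sum((a[m] - ma) ** 2 for m in common) *
                    sum((b[m] - mb) ** 2 for m in common))
    return num / den if den else 0.0

def predict(user, movie, k=20):
    """Similarity-weighted average over the k most similar users who rated the movie."""
    sims = sorted(
        ((pearson(ratings[user], r), r[movie])
         for u, r in ratings.items() if u != user and movie in r),
        reverse=True)[:k]
    num = sum(s * stars for s, stars in sims if s > 0)
    den = sum(s for s, _ in sims if s > 0)
    mean = sum(ratings[user].values()) / len(ratings[user])
    return num / den if den else mean  # fall back to the user's mean

print(predict("u1", "m4"))  # -> 1.0 (only u2 is positively correlated with u1)
```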

16 comments:

Unknown said...

Congratulations, very impressive score.

Can you please elaborate a bit more (perhaps a couple of links) on these methods:

"Anchoring"
"Milestones"

by32768 said...

Hi, I implemented both viewer and movie clustering, using completely symmetrical approaches. I found movie clustering is significantly worse than viewer clustering; I still don't know if this is just the nature of things or if there's a bug in my movie clustering code.

What's your experience?

Acmehill said...

Anchoring is described here (http://en.wikipedia.org/wiki/Anchoring). It is basically the idea that people select one main attribute when considering a decision and then adjust the other attributes to fit their initial assessment. For example, you decide to watch an early movie by your favorite actor. He/she wasn't famous yet and isn't the star of the movie that you are watching, but his/her performance may still be the deciding factor in whether you like the movie or not. I employ anchoring by trying to determine, at any given time (yes, I use the rating date), what "features" a user is basing his/her ratings on.
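As a very rough illustration of the flavor of the idea (a Python toy with invented attribute columns, not my actual method): from a user's most recent ratings, pick the movie attribute that best correlates with how they scored those films, and treat that as their current anchor.

```python
import numpy as np

# Rows: the user's last few ratings in date order. Columns: invented movie
# attributes (say comedy, favorite-actor presence, budget), scaled to [0, 1].
features = np.array([[1.0, 0.0, 0.3],
                     [0.8, 1.0, 0.9],
                     [0.1, 1.0, 0.7],
                     [0.0, 1.0, 0.2]])
recent_ratings = np.array([2.0, 5.0, 4.0, 4.0])

# The "anchor" is the attribute most correlated with the recent ratings.
corr = [abs(np.corrcoef(features[:, j], recent_ratings)[0, 1])
        for j in range(features.shape[1])]
anchor = int(np.argmax(corr))
print("anchor feature:", anchor, "correlations:", [round(c, 2) for c in corr])
```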

Milestones (my word) represent movies that, if rated, have the greatest effect on the ratings of other movies (by the same user). For example, once you've seen "The Sting", you may have a very different opinion of other "hustle" movies. Once you calculate the effect that previously rated (viewed?) movies have, the order of ratings becomes important.
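A toy way to measure such an effect (again Python, all data invented): compare how users rate a target film when they rated the candidate milestone first versus when they did not.

```python
# Toy (user, movie, rating, date) tuples; dates are ordinal days, all invented.
data = [("u1", "sting", 5, 1), ("u1", "hustle", 5, 9),
        ("u2", "sting", 4, 3), ("u2", "hustle", 4, 7),
        ("u3", "hustle", 2, 5),
        ("u4", "hustle", 3, 2), ("u4", "sting", 5, 8)]

def milestone_shift(milestone, target):
    """Mean shift in the target's ratings when the milestone was rated first."""
    seen = {u: d for u, m, _, d in data if m == milestone}
    after, rest = [], []
    for u, m, r, d in data:
        if m != target:
            continue
        (after if u in seen and seen[u] < d else rest).append(r)
    if not after or not rest:
        return 0.0
    return sum(after) / len(after) - sum(rest) / len(rest)

print(milestone_shift("sting", "hustle"))  # -> 2.0 on this toy data
```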

Acmehill said...

dmnewbie,

I have also implemented movie and user clustering and have seen the same effect as you.

I think it is because multiple movies watched by one user vary less than multiple users watching one movie: the STD of the ratings around the user average is (slightly) less than the STD around the movie average. This, coupled with the higher support per movie, makes customer clustering more effective. It is harder to cluster movies because their increased STD and greater support make finding like clusters difficult.
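This is easy to check directly on the data; a minimal sketch (Python, with toy triples standing in for the training set):

```python
import numpy as np
from collections import defaultdict

# Toy (user, movie, rating) triples standing in for the 100M training ratings.
triples = [("u1", "m1", 5), ("u1", "m2", 4), ("u2", "m1", 3),
           ("u2", "m2", 2), ("u3", "m1", 4), ("u3", "m3", 1)]

def residual_std(key_index):
    """STD of ratings around each group's mean; group by user (0) or movie (1)."""
    groups = defaultdict(list)
    for t in triples:
        groups[t[key_index]].append(t[2])
    residuals = [r - np.mean(rs) for rs in groups.values() for r in rs]
    return float(np.std(residuals))

print("around user means: ", residual_std(0))
print("around movie means:", residual_std(1))
```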

by32768 said...

Thanks for your response, that settled one thing for me.

Would you care to describe the topology of your neural network approach? I'd like to implement it, but my neural network experience is based on image processing, and I'm not sure how best to set up the network for this task.

D said...

Hello,

If you would be interested in working on a recommender system for a web startup focused on music, please contact me.

Thank you.

Sincerely,
David
david@storylog.com

by32768 said...

Hi,

Did you get anywhere with user-user KNN?

I implemented this and let it generate predictions for 10% of the probe, on the residuals of global effects.

The RMSE on this 10% is about 0.92, which is not bad, but it contributes very little (0.0002) when blended with other predictions.
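For reference, a common way to compute such a blend is least squares against the probe ratings; a minimal sketch (all values invented):

```python
import numpy as np

# Toy stand-ins: each column of P is one predictor's probe predictions,
# y holds the true probe ratings; all numbers invented.
P = np.array([[3.2, 3.5], [4.1, 3.9], [2.0, 2.6], [4.8, 4.4]])
y = np.array([3.0, 4.0, 2.5, 5.0])

# Blend weights minimizing squared probe error (linear regression, no intercept).
w, *_ = np.linalg.lstsq(P, y, rcond=None)
blend = P @ w
rmse = float(np.sqrt(np.mean((blend - y) ** 2)))
print("weights:", w, "blended probe RMSE:", round(rmse, 4))
```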

Unknown said...

A simple suggestion for insane Netflix customers who forget they have seen a movie --- probably most important considering our aging population.

Can you say, "You have already seen this movie, but you would really like this one..."?

Of course this could cut into their revenue?

John Boehm said...

I read about your problem in the NYT. If the datasets allow you to factor in other ratings from the same user that generated a review of ND (Napoleon Dynamite), could you discount their review of ND? It seems that ND is an exception to normal correlation and needs to have its rating adjusted for relevance. I have not looked at the datasets, so I am not sure if you can make a secondary rating based on individual user ratings. If ratings can be associated with a unique username, and their individual preferences taken into consideration, the ND score could be adjusted to what they would actually have rated the film if it were consistent with their other ratings.

To be clear, I have no idea what I am talking about, and I know you have devoted hours to this problem, so I have no business being right or wrong. It just struck me reading the column.

Mike said...

Hi Len,

Curious if you've found any correlation between the taste of particular viewers and that of particular critics.

Particularly with regard to 'the Napoleon Dynamite problem,' I would think that critics who have more established taste and a refined set of criteria for liking certain films might offer a window into predicting what casual viewers who often agree with these critics might also like.

Good luck,
M

Aaron Scott said...

Len, some ideas...

Maybe the answer is in the descriptions of the films. There are literally hundreds of categories each film can fall into. In the case of Napoleon Dynamite, I'm thinking of words like "quirky" and "offbeat" and "unusual" and "comedy" and "high school" and "nerdy," etc.

Of course, there are also the names of the stars in the films, the amount of money it took to make each one, maybe even the opening day (since expected blockbusters tend to be released at certain times of the year, I understand).

As you can imagine, I could go on for a long time.

It occurred to me that I often like films by the same director (Shyamalan, for instance), at least for the most part.

My way of trying to solve the matter was by first identifying hundreds of sub-categories that classify certain films.

So, while "Napoleon" is listed as, I guess, a comedy by Netflix, there are many sub-categories that also describe it.

It would seem logical that a film that is classified very nearly like "Napoleon" in terms of category and sub-categories would likely receive a similar rating from viewers.

To tweak this, you might add weights to certain categories. For instance, "sci-fi" may carry more weight than "teen" when you're talking about a sci-fi film that features teens.

And as for sequels, I would imagine that most people at least like sequels if the same stars are involved--some form of nostalgia, I suppose.

OK, so here's how you do it....

You take a sampling of the films in a particular category (drama, say). Then, using as many sub-categories as you can, you label each film with as many descriptive terms as is suitable (there might be a list of, say, 100 sub-categories from which you would choose, since presumably you wouldn't necessarily have to identify EVERY sub-category).

Then, once each film has a "profile" based on the sub-categories which describe it, you compare things.

Is there a particular director that winds up overwhelmingly in the 5-star ranks? Or one that doesn't? Are there any sub-categories that tend to be attached to low-rated films? (By the way, "film budget" might be important, too.)

Further, can you identify common things that make for a well-liked film?

For instance, if you find that someone who likes "Comedy" will have a strong statistical chance of also liking comedy films that have this or that star, and this or that director...that would mean that they could likely watch ANY comedy...and Netflix could recommend another comedy based on your findings and, sure enough, they'd likely enjoy it, too.

By the way, "Year made" and "Year Portrayed" are likely important. Some people prefer older films...and some people prefer films that are about a particular time period.
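A minimal sketch of this weighted profile comparison (Python; all tags and weights invented), scoring two films by weighted tag overlap:

```python
# Toy tag profiles and weights, all invented; heavier tags count more.
profiles = {
    "napoleon_dynamite": {"quirky", "offbeat", "comedy", "high school", "nerdy"},
    "rushmore": {"quirky", "offbeat", "comedy", "high school"},
    "die_hard": {"action", "thriller"},
}
tag_weight = {"quirky": 2.0, "offbeat": 2.0}  # unlisted tags default to 1.0

def weighted_jaccard(a, b):
    """Weighted overlap of two films' tag sets."""
    w = lambda tags: sum(tag_weight.get(t, 1.0) for t in tags)
    inter = w(profiles[a] & profiles[b])
    union = w(profiles[a] | profiles[b])
    return inter / union if union else 0.0

print(weighted_jaccard("napoleon_dynamite", "rushmore"))  # high: ~0.86
print(weighted_jaccard("napoleon_dynamite", "die_hard"))  # 0.0
```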

If these help, please e-mail me at aaronescott@yahoo.com. $50,000 would be plenty! Good luck!

Unknown said...

I'd like to be able to view movies with confidence that there won't be offensive content... which I realize is a very different search parameter for each of us. Perhaps the solution would be to allow users a customizable two-tier search filter for personalized blacklisting of (1) words and (2) actions. 'Twould require quite a complete indexing of movies... maybe some of the "family" movie rating sites (many can be found via Google) would sell their lists of things like "12 instances of D*** and 4 occurrences of F***." Quite an unusual job for folks claiming to be offended by such things, huh?

We use the same kind of keyword filtering on our spam firewall appliance at work. It seems like some of the heuristics might cross over well from that kind of technology, so there's another idea: borrow someone else's wheel rather than reinventing it.
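A toy version of that two-tier blacklist (Python; the content index entries are invented placeholders):

```python
# Toy content index and one user's blacklist; all entries are invented.
content_index = {
    "support_your_local_sheriff": {"words": [], "actions": ["prostitution"]},
    "some_comedy": {"words": ["profanity"], "actions": []},
    "clean_family_film": {"words": [], "actions": []},
}
blacklist = {"words": {"profanity"}, "actions": {"prostitution"}}

def passes_filter(movie):
    """True if the movie trips neither the word nor the action blacklist."""
    entry = content_index[movie]
    return (not blacklist["words"] & set(entry["words"]) and
            not blacklist["actions"] & set(entry["actions"]))

print([m for m in content_index if passes_filter(m)])  # -> ['clean_family_film']
```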

Example movie:

"Support Your Local Sheriff", Rated G, but with regular shots of and references to the local brothel and a closing scene where the "respected leaders of the community" emerge from the brothel in various states of undress. The way that the whore house is presented is more of the problem, but in general I would like to know ahead of family movie time if prostitution is going to be shown as normal and acceptable. Doesn't mean I won't watch the movie, but I might be a little more prepared as to audience and debriefing material : )

Please forgive me for offending anyone who thinks it's morally acceptable to pay for sex; I've just got this big Jesus thing going on in my life.

ferrini said...

I read the Times article today and found the problem interesting, especially the part about "Napoleon Dynamite" and other problem movies.

Seems to me those movies share something. They are exactly the sort of films that self-described hip types hate to allow as films they'd predictably like or dislike, because the films fall into the netherland between hip and gauche. The game is not judging the movie qua movie, but judging the crowd in the abstract, who may pan or rave. This is a crowd of fools we're talking about, and their opinions are at least as much about being appreciated as wise cultural consumers as they are about actually liking or not liking a given offering.
I don't know a fig about programming, nor what data you have available to indirectly forecast a given customer's Napoleon Dynamite tastes, but I do think you could create a model that is a referendum on said customer's scores on the other hard movies. See if he/she tends to pan or like those films, and guess that they're more likely to rate Napoleon Dynamite the same way, or any of the others in the category. The issue here is that the category defies prediction because that's the objective of the customer; you have to accept a relatively high rate of error but can still improve by taking it into consideration.
Just a thought.
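A bare-bones version of that referendum idea (Python; film list and ratings invented): average the customer's scores on the other hard films and use that as the guess.

```python
# Invented film list and ratings: guess a customer's Napoleon Dynamite score
# from how they rated other hard-to-predict, love-it-or-hate-it films.
hard_films = ["lost_in_translation", "sideways", "i_heart_huckabees"]
user_ratings = {"lost_in_translation": 5, "sideways": 4, "die_hard": 3}
overall_mean = 3.6  # stand-in for the global mean rating

scores = [user_ratings[f] for f in hard_films if f in user_ratings]
prediction = sum(scores) / len(scores) if scores else overall_mean
print(prediction)  # -> 4.5: this customer tends to like the "hard" films
```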

Ishmael said...

Hi Len,
Just read the article in the NYT. Interesting Napoleon Dynamite problem. Have you considered simply assigning a random rating to this film (and others like it)?

When building recommendations, ignore all ratings for these films and give them a more-or-less random value.

Milan Mitrovich
Wrightsville, PA

Unknown said...

The only way to make a leap in accuracy is to truly analyze the movies themselves: the tempo, rhythm, color, flicker rate, the speed at which the characters talk.
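A sketch of extracting one such signal, the per-frame average color, using Python and OpenCV ("movie.mp4" is a placeholder path):

```python
import cv2  # OpenCV: pip install opencv-python
import numpy as np

# Average each frame's color across the whole film.
cap = cv2.VideoCapture("movie.mp4")
means = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    means.append(frame.mean(axis=(0, 1)))  # per-frame mean B, G, R
cap.release()

if means:
    print("film-wide mean color (BGR):", np.mean(means, axis=0))
```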

I pondered this idea and found the following research paper

"Application of Computational Media Aesthetics Methodology to Extracting Color Semantics in Film"

"ABSTRACT
Using lm grammar as the underpinning, we study the ex-
traction of structures in video based on color using a wide
con guration of clustering methods combined with existing
and new similarity measures. We study the visualisation
of these structures, which we call Scene-Cluster Temporal
Charts and show how it can bring out the interweaving of
di erent themes and settings in a lm. We also extract color
events that lmmakers use to draw/force a viewer's atten-
tion to a shot/scene. This is done by rst extracting a set
of colors used rarely in lm, and then building a probabilis-
tic model for color event detection. We demonstrate with
experimental results from ten movies that our algorithms
are e ective in the extraction of both scene-cluster tempo-
ral charts and color events."

I'm not sure what kind of off-the-shelf software exists to extract this kind of data; http://www.virage.com, Avid, Adobe Premiere Pro, and/or other movie editing programs are a start.

Perhaps a company would sponsor you with their software in hopes that you will win the prize. It would be an excellent way to market/advertise their company.

Congrats on your New York Times mention and Good luck!

Dave said...

Len,

Would love to talk to you about a company I have founded. I'm a successful entrepreneur already, but have a great idea that you could potentially be a part of. Ping me if you'd like to chat. -Dave