Movie Genre Ratings - Addendum


This article was created with Jupyter.


In a previous post, I looked at trends in critic and audience ratings of movies. One finding was that critics, on average, rate movies lower than audiences do. One potential explanation is that audience ratings carry more selection bias: only people who are likely to enjoy a film will tend to see (and therefore rate) it. Under this explanation, audiences and critics use the scale the same way: if you asked an audience member or a critic how much enjoyment they got out of a film they rated a 5, you would get similar responses.

However, an alternative explanation is that when I compare audience and critic responses, I'm comparing apples and oranges. Maybe critics, having made a career of rating things, actually have a better calibrated rating scale. As a population, they use a wider range of values. Under this explanation, a 5 from an audience (around the lowest I saw for any movie) is actually more like a 2 from a critic.

Which interpretation is correct? I originally went with the 'they use the same scale' interpretation, partially because I think it holds merit and partially because it leads to numbers that are easier to digest. A 2 point difference on a 10 point scale is meaningful to most people in a way that 1.5 standard deviations is not. However, it's still interesting to see what happens if we use normalized values.

I reran all of the same code, but with normalized (z-scored) values. For the non-stats-savvy: z-scoring subtracts each group's mean and divides by its standard deviation, so the spread of the ratings is about the same for audiences and critics, and an 'average' rating is a 0. This is easiest to see in the first figure (compare it to the one in the previous post).
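To make the normalization concrete, here is a minimal sketch of z-scoring in plain Python. The ratings are invented for illustration and are not from the dataset. It uses the population standard deviation, which is also the default for scipy's `zscore` (`ddof=0`).

```python
from statistics import mean, pstdev

def zscore(values):
    """Subtract the mean and divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical ratings, invented for illustration (not from the dataset)
audience = [6.0, 7.0, 7.5, 8.0, 9.0]   # narrow range, high average
critic = [2.0, 4.5, 6.0, 7.5, 9.5]     # wider range, lower average

audience_z = zscore(audience)
critic_z = zscore(critic)

# After z-scoring, both groups are centered on 0 with a spread of 1
for name, z in (("audience", audience_z), ("critic", critic_z)):
    print(name, [round(v, 2) for v in z])
```

Whatever the original scales looked like, both lists now have a mean of 0 and a standard deviation of 1, so a value like 1.0 means "one standard deviation above that group's typical rating".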

Visualizing Normalized Movie Ratings

In [1]:
%matplotlib inline

#Import some tools we'll use
import pandas as pd
import numpy as np
from bokeh.palettes import Category20, Category20b
from bokeh.plotting import figure, show, output_notebook
from bokeh.models import ColumnDataSource, Range1d, LabelSet, Label, HoverTool
from bokeh.charts.attributes import CatAttr
from bokeh.charts import Bar
from scipy.stats.mstats import zscore

#Only output floats to one decimal place in tables
pd.options.display.float_format = '{:,.1f}'.format

In [2]:
###Just copying and pasting the code from the other notebook and zscoring

#Read in the data
mov = pd.read_csv('blockbuster-top_ten_movies_per_year_DFE.csv')

#Remove rows where data is missing (a score of 0 or no first genre)
mov = mov[(mov['rt_score'] != 0) & (mov['rt_audience_score'] != 0) & (~pd.isnull(mov['Genre_1']))]

#rt_score is on a 1 to 10 scale, rt_audience_score is 0.5 to 5. Let's rescale
mov['rt_audience_score'] = mov['rt_audience_score']*2

#Only include those columns we want
mov = mov[['title', 'rt_audience_score','rt_score','Genre_1','Genre_2','Genre_3']]

mov['rt_audience_score'] = zscore(mov['rt_audience_score'])
mov['rt_score'] = zscore(mov['rt_score'])

#Calculate difference between audience and critic scores (to use later)
mov['dif'] = mov['rt_audience_score'] - mov['rt_score']

#Generate a unique color based on whatever the first genre is for each movie, for plotting
cat = Category20[20]
genres = mov['Genre_1'].unique()
mov['colors'] = [cat[np.where(genres == genre)[0][0]] for genre in mov['Genre_1']]

#Generate a genre string based on the three genres (to prevent hover tooltips from displaying nans)
mov['Genre_str'] = (mov['Genre_1'] + ' ' +
                    mov['Genre_2'].fillna('') + ' ' +
                    mov['Genre_3'].fillna(''))

In [3]:
#Themes aren't really supported yet in Bokeh, so define a function to do styling
def styleBokeh(p):
    #Hide the bokeh toolbar
    p.toolbar_location = None
    #Format axis label fonts
    p.yaxis.axis_label_text_font_size = "12pt"
    p.yaxis.axis_label_text_font_style = "normal"
    p.xaxis.axis_label_text_font_size = "12pt"
    p.xaxis.axis_label_text_font_style = "normal"
    p.xaxis.major_label_text_font_size = "10pt"
    p.yaxis.major_label_text_font_size = "10pt"
    #Change title font size
In [4]:

#Make a data source for bokeh to use
mov_source = ColumnDataSource(mov)

#We want the axes on the same scale, so figure out min of both and max of both
low = mov[['rt_audience_score','rt_score']].min().min() - 0.2
high = mov[['rt_audience_score','rt_score']].max().max() + 0.2

#Define the hover over tooltips
hover = HoverTool(
        tooltips=[("Title", "@title"),
                  ("Audience Score", "@rt_audience_score{0.0}"),
                  ("Critic Score", "@rt_score{0.0}"),
                  ("Genres", "@Genre_str")])

#Make the figure (y-range and tools are assumed: same scale on both axes, hover enabled)
p = figure(title = "Movie critic and audience ratings",
           x_range=Range1d(low, high),
           y_range=Range1d(low, high),
           tools=[hover])

#We'll put critic ratings on the x-axis and audience on the y-axis
p.xaxis[0].axis_label = 'Critic Rating'
p.yaxis[0].axis_label = 'Audience Rating'

#Add a diagonal line for where audience rating = critic rating
p.line((low,high), (low,high),line_color="red", line_width=5, alpha=0.5)

#Plot the points, coloring by Genre_1
p.scatter(x='rt_score', y='rt_audience_score', source=mov_source, color='colors', 
          size=8, fill_alpha = 1)


The audience and critic ratings have about the same ranges, and the red line seems to go straight through the center of the movies. Great! Now we can look at how this changes things by genre.
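The by-genre view rests on one bookkeeping step: a movie with several genres is counted once under each of them, and the differences are then averaged per genre. Here is a minimal sketch of that step in plain Python. The movies, scores, and the lowered `MIN_COUNT` threshold are all made up for illustration; the notebook itself does this with pandas and keeps only genres with more than 20 entries.

```python
from collections import defaultdict

# Hypothetical rows: (title, normalized audience score, normalized critic score, genres)
movies = [
    ("Movie A", 1.0, -0.5, ["Fantasy", "Adventure"]),
    ("Movie B", 0.5, 0.5, ["Drama"]),
    ("Movie C", -0.5, 1.0, ["Drama", "Romance"]),
    ("Movie D", 1.5, 0.0, ["Fantasy"]),
]

MIN_COUNT = 1  # assumed toy threshold; the notebook uses 20

# "Melt": one (genre, difference) entry for every genre a movie lists
diffs_by_genre = defaultdict(list)
for title, audience, critic, genres in movies:
    for genre in genres:
        diffs_by_genre[genre].append(audience - critic)

# Average per genre, keeping only genres with more than MIN_COUNT entries
genre_dif = {
    genre: sum(diffs) / len(diffs)
    for genre, diffs in diffs_by_genre.items()
    if len(diffs) > MIN_COUNT
}
print(genre_dif)
```

A positive average means audiences like the genre relatively more than critics do, and a negative one means the reverse. Genres with too few entries are dropped so that one or two movies can't define a genre.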

In [5]:

#Melt the genre columns so we only have one. This will give multiple entries for movies with more than one genre
melted_mov = pd.melt(
    mov.drop(['colors', 'Genre_str'],axis=1),
    #Assumed arguments: keep the score columns as ids and name the melted column 'genre'
    id_vars=['title', 'rt_audience_score', 'rt_score', 'dif'],
    value_name='genre')

#A little bit of clean up
melted_mov = melted_mov.drop('variable', axis=1)
melted_mov = melted_mov.dropna()

#Group by genre
genre_groups = melted_mov.groupby('genre')

#Exclude genres with 20 or fewer entries
genre_revs = genre_groups.mean()[genre_groups['genre'].count() > 20]

In [6]:

#Make a data source for bokeh
genre_revs_source = ColumnDataSource(genre_revs)

#Get axes ranges
low = genre_revs[['rt_audience_score','rt_score']].min().min() - 0.05
high = genre_revs[['rt_audience_score','rt_score']].max().max() + 0.05

#Define the hover over tooltips
hover = HoverTool(
        tooltips=[("Genre", "@genre"),
                  ("Audience Score", "@rt_audience_score{0.0}"),
                  ("Critic Score", "@rt_score{0.0}"),
                  ("Difference", "@dif")])

#Make the figure (y-range and tools are assumed: same scale on both axes, hover enabled)
p = figure(title='Audience and critic ratings by genre',
           x_range=Range1d(low, high),
           y_range=Range1d(low, high),
           tools=[hover])

p.xaxis[0].axis_label = 'Critic Rating'
p.yaxis[0].axis_label = 'Audience Rating'

#Add a line for where critic score = audience score
p.line((low,high), (low,high),line_color="red", line_width=5, alpha=0.5)

#Scatter plot the genre scores
p.scatter(x='rt_score', y='rt_audience_score', source=genre_revs_source)

#Add labels for each genre
labels = LabelSet(x='rt_score', y='rt_audience_score', text='genre', source=genre_revs_source, 
              x_offset=0, y_offset=5, text_font_size="7.5pt", text_align='center')

#Hide the bokeh toolbar
p.toolbar_location = None


Now we can see which genres audiences like more than average (above 0 on the y-axis) and which genres critics like more than average (to the right of 0 on the x-axis). We can also see where audiences and critics disagree most relative to their normal ratings. So while audience members rate drama movies higher than critics do on average, when you take the audience's tendency to rate everything pretty high into account, critics actually like drama more.
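The drama result can look paradoxical, so here is a small worked sketch of how the sign of a difference can flip once each group's typical rating and spread are taken into account. All of the numbers are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the dataset.

```python
def to_z(score, mean, std):
    """Express a raw score in standard deviations from its group's mean."""
    return (score - mean) / std

# Hypothetical numbers chosen for illustration (not taken from the dataset)
audience_mean, audience_std = 7.0, 1.0   # audiences rate high, with a narrow spread
critic_mean, critic_std = 5.5, 2.0       # critics rate lower, with a wider spread

drama_audience_raw = 7.5   # assumed average raw audience rating for a genre
drama_critic_raw = 7.0     # assumed average raw critic rating for the same genre

# On the raw scale the audience rates the genre higher
raw_dif = drama_audience_raw - drama_critic_raw

# On the normalized scale the critic rating is further above the critics' average
z_dif = (to_z(drama_audience_raw, audience_mean, audience_std)
         - to_z(drama_critic_raw, critic_mean, critic_std))

print(raw_dif, z_dif)
```

Here the raw difference is positive (audiences rate the genre half a point higher), but the normalized difference is negative: a 7.0 is well above what these hypothetical critics usually give, while a 7.5 is only slightly above the audience's norm. Because z-scoring is a linear transformation, averaging z-scores per genre gives the same result as z-scoring the genre's average.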

Again, this is easier to see if we take the difference.

In [7]:
genre_revs = genre_revs.sort_values('dif', ascending = False)
p = Bar(genre_revs.reset_index(),
        label=CatAttr(columns=['genre'], sort=False),
        values='dif', #assumed: the bar heights are the differences
        title='Difference between critics and audience ratings by genre',
        ylabel='Audience ratings minus critic ratings')
p.toolbar_location = None

Now we see that fantasy is still the genre audiences and critics disagree about most, but we also get a neat separation: there are genres audiences like relatively more (fantasy, adventure, and action) and genres critics like relatively more (thriller, drama, comedy, crime, and romance). Sci-fi, animation, and family are where there is the most agreement.

For completeness, I'm including the top 10 lists I generated for the last post. The lists are quite similar to the unstandardized ones.

In [8]:
#Make mov presentable for printing
print_mov = mov.drop(['Genre_1', 'Genre_2', 'Genre_3', 'colors'], axis=1)
print_mov.columns = ['Title', 'Audience Score', 'Critic Score', 'Difference', 'Genres']

Top 10 Fantasy Movies Audiences Love More Than Critics

We still get lots of Twilight and superhero movies on this scale.

In [9]:
#Get the titles of the 10 fantasy movies with the greatest difference between audience and critics
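fantasy_movie_titles = (melted_mov[melted_mov['genre'] == 'Fantasy'].
                        sort_values(by='dif', ascending=False)
                        ['title'].head(10))

#Make the table pretty and output it
#(selecting rows by title is an assumption; it presumes titles are unique)
fantasy_table = (print_mov[print_mov['Title'].isin(fantasy_movie_titles)].
                 sort_values('Difference', ascending=False))
fantasy_table.index = range(1,11)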

Rank | Title | Audience Score | Critic Score | Difference | Genres
1 | 300 | 1.6 | -0.4 | 2.0 | War Fantasy Action
2 | The Twilight Saga: Breaking Dawn - Part 2 | 0.8 | -1.1 | 2.0 | Fantasy Drama Adventure
3 | The Twilight Saga: New Moon | 0.3 | -1.5 | 1.8 | Fantasy Drama Adventure
4 | Pirates of the Caribbean: At World's End | 0.8 | -0.9 | 1.7 | Fantasy Adventure Action
5 | The Hobbit: An Unexpected Journey | 1.6 | 0.0 | 1.6 | Fantasy Adventure
6 | Pirates of the Caribbean: Dead Man's Chest | 1.1 | -0.5 | 1.6 | Fantasy Adventure Action
7 | The Hobbit: The Desolation of Smaug | 1.6 | 0.2 | 1.5 | Fantasy Adventure
8 | Man of Steel | 1.1 | -0.3 | 1.4 | Fantasy Adventure Action
9 | Thor: The Dark World | 1.1 | -0.3 | 1.4 | Fantasy Adventure Action
10 | The Hobbit: The Battle of the Five Armies | 1.1 | -0.2 | 1.3 | Fantasy Adventure

Top 10 Movies Audiences Love More Than Critics

This list has mostly the same movies as the unstandardized version, with some of the ordering slightly different.

In [10]:
top10_table = print_mov.sort_values('Difference', ascending=False)[0:10]
top10_table.index = range(1,11)

Rank | Title | Audience Score | Critic Score | Difference | Genres
1 | Bad Boys II | 0.3 | -2.0 | 2.3 | Crime Comedy Action
2 | Transformers | 1.6 | -0.6 | 2.3 | Sci-Fi Adventure Action
3 | Fast & Furious 6 | 1.9 | -0.3 | 2.2 | Thriller Crime Action
4 | Transformers: Revenge of the Fallen | 0.1 | -2.1 | 2.2 | Sci-Fi Adventure Action
5 | 300 | 1.6 | -0.4 | 2.0 | War Fantasy Action
6 | The Twilight Saga: Breaking Dawn - Part 2 | 0.8 | -1.1 | 2.0 | Fantasy Drama Adventure
7 | Despicable Me 2 | 1.9 | 0.0 | 1.9 | Family Comedy Animation
8 | The Twilight Saga: New Moon | 0.3 | -1.5 | 1.8 | Fantasy Drama Adventure
9 | National Treasure: Book of Secrets | 0.3 | -1.4 | 1.7 | Mystery Adventure Action
10 | Pirates of the Caribbean: At World's End | 0.8 | -0.9 | 1.7 | Fantasy Adventure Action

Top 10 Movies Critics Love More Than Audiences

Again, this mostly looks the same as the unstandardized list.

In [11]:
bot10_table = print_mov.sort_values('Difference', ascending=True)[0:10]
bot10_table.index = range(1,11)

Rank | Title | Audience Score | Critic Score | Difference | Genres
1 | King Kong | -1.8 | 0.9 | -2.7 | Drama Adventure Action
2 | Rocky | -0.7 | 1.4 | -2.1 | Sport Drama
3 | E.T. the Extra-Terrestrial | 0.1 | 2.0 | -1.9 | Sci-Fi Family Western
4 | War of the Worlds | -1.5 | 0.3 | -1.9 | Thriller Sci-Fi Adventure
5 | Charlie and the Chocolate Factory | -1.3 | 0.5 | -1.8 | Comedy Adventure Family
6 | Titanic | -0.5 | 1.1 | -1.6 | Romance Drama
7 | Back to School | -1.3 | 0.3 | -1.6 | Sport Romance Comedy
8 | The Incredibles | -0.2 | 1.4 | -1.6 | Animation Adventure Action
9 | Shakespeare in Love | -0.2 | 1.4 | -1.6 | Romance Drama Comedy
10 | Toy Story 2 | 0.1 | 1.6 | -1.5 | Comedy Animation Adventure


I think you could spend a lot of time discussing the best way to interpret the audience and critic scores. It's a strange case where we have a seemingly intuitive score (a 10-point scale), but we don't know whether our populations (critics and audiences) are using it the same way. It's nice that, for the most part, this difference in interpretation doesn't change much in terms of which genres or movies audiences and critics disagree on most.
