On movies
My favorites
I could simply list the movies I have rated 9/10 or 10/10 throughout my life, but that would be both underwhelming and a bit boring. Besides, some of my all-time favorites are universally acknowledged masterpieces: the Lord of the Rings trilogy, The Matrix, The Godfather I & II, Lawrence of Arabia, The Shawshank Redemption, and films by legends like Kubrick, Kurosawa, Hitchcock, and so on.
Instead, allow me to share five outstanding films which, although fairly well known and well received, I feel not everyone has had the chance to watch or, in some cases, even heard of. To be clear, these five are simply what came to mind today. Ask me again next month and I might have a different list. Let’s just say these are the ones that, with a little effort of recollection, floated to the surface of my thoughts.
Go watch them!
12 Monkeys (1995)

12 Monkeys is directed by Terry Gilliam, best known for his work with Monty Python. It’s an adaptation of the 1962 French short film La Jetée (directed by Chris Marker), a haunting piece composed entirely of still photographs.
Whenever I’m asked about my favorite time-travel movie, 12 Monkeys is always my answer. Gilliam is a master at making us feel the paranoia that consumes his characters. His framing and visual style often evoke discomfort and disorientation. Don’t believe me? Watch Brazil, another one of his sublime works.
What makes 12 Monkeys so compelling to me, and why I have watched it countless times, is how it strips time travel of its usual glamor. Instead, it presents a bleak, chaotic reality where time travel feels more like a punishment: ugly, confusing, and deeply unsettling. Brad Pitt gets most of the spotlight for his Oscar-nominated performance as a manic lunatic, but it is Bruce Willis who, in my opinion, truly carries the emotional weight. His portrayal of a man burdened by the impossible task of correcting the past is what sold me on the film’s tragic brilliance.
A Separation (2011)

I remember that the first time I watched this film, it felt like I was watching a documentary unfold. Nothing about it seems scripted, rehearsed, or even planned, and I mean that in the most positive way. It plays out as if a hidden camera were quietly capturing the steps, misfortunes, and emotional struggles of an ordinary family in Iran, desperately trying (or not?) to find balance within their household. Make no mistake: you won’t find happiness here. A Separation is an incredibly sad film, but one that absolutely deserves to be seen. I believe there is something deeply human to be learned from it…
The Secret in Their Eyes (2009)

Let the Bullets Fly (2010)

The Hunt (2012)

IMDb Data Analyzer
There was a time when IMDb provided user statistics within our profiles, but unfortunately, that’s no longer the case.
So, I’ve decided to code a basic (and far from perfect) Python script to bring those statistics back to life, along with some additional ones.
If you have any suggestions for the code or the types of statistics to include, please let me know.
You can check out the code here
To the best of my knowledge, GitHub Pages, Jekyll, and Jupyter Notebooks do not integrate seamlessly.
To display the notebook shown below, I used the Anaconda Prompt to generate a markdown file by running the following command:
jupyter nbconvert --to markdown <file name>.ipynb
Between this method and converting the notebook to HTML, I prefer the former.
Import libraries and load the CSV file with ratings.
You can obtain the ratings file from your IMDb user account.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import date
ratings = pd.read_csv("ratings.csv")
print("Last updated:" + str(date.today()))
Last updated:2025-11-07
Parsing data.
Essentially, I will count how often each value occurs: each rating, release year, runtime, and genre.
I am interested only in my own ratings, IMDb ratings, release year, and genres.
Please note that one movie can belong to multiple genres.
# Row positions of the entries whose title type is 'Movie'
movie_idxs = ratings.index[ratings['Title Type']=='Movie'].tolist()
# Columns are selected by position, so this relies on the column order of the exported file
my_movie_ratings = ratings.iloc[movie_idxs,1]  # my ratings
counted_ratings = np.unique(my_movie_ratings, return_counts=True)  # (unique values, counts)
imdb_ratings = ratings.iloc[movie_idxs,7]  # IMDb ratings
my_movie_years = ratings.iloc[movie_idxs,9]  # release years
counted_years = np.unique(my_movie_years, return_counts=True)
my_movie_runtimes = ratings.iloc[movie_idxs,8]  # runtimes
counted_runtimes = np.unique(my_movie_runtimes, return_counts=True)
my_movie_genres_aux = ratings.iloc[movie_idxs,10].str.split(", ") # the space after the comma is important
# Flatten the per-movie genre lists into a single list before counting
my_movie_genres = [item for sublist in my_movie_genres_aux for item in sublist]
counted_genres = np.unique(my_movie_genres, return_counts=True)
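The same counts can also be obtained by column name rather than by position, which would keep working if the column order of the export ever changed. What follows is only a minimal sketch: it assumes the column names `Title Type`, `Your Rating`, and `Genres` (the ones used later in this notebook), and it runs on a tiny made-up table instead of the real `ratings.csv`.

```python
import pandas as pd

# Made-up rows standing in for ratings.csv (illustrative data only).
sample = pd.DataFrame({
    "Title Type": ["Movie", "Movie", "TV Series", "Movie"],
    "Your Rating": [8, 9, 7, 8],
    "Genres": ["Drama, Sci-Fi", "Drama", "Comedy", "Sci-Fi, Thriller"],
})

# Keep movies only, selecting rows with a boolean mask.
movies = sample[sample["Title Type"] == "Movie"]

# How many movies received each score, ordered by score.
rating_counts = movies["Your Rating"].value_counts().sort_index()

# Split each genre string, give every genre its own row, then count.
genre_counts = movies["Genres"].str.split(", ").explode().value_counts()

print(rating_counts.to_dict())
print(genre_counts.to_dict())
```

Here `value_counts` plays the role of `np.unique(..., return_counts=True)`, and `explode` replaces the list comprehension that flattens the genre lists.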
Plotting the distribution of my ratings.
The total number of movies I have given each score is indicated on top of each bar.
plt.figure(figsize=(6, 6), frameon=True)
plt.bar(counted_ratings[0], counted_ratings[1], color ='#4682B4', width = 0.5, edgecolor='wheat', linewidth=2)
plt.xlabel("My Rating")
plt.ylabel("Number of movies")
plt.xticks(counted_ratings[0])
for i in range(len(counted_ratings[0])):
    plt.text(counted_ratings[0][i], counted_ratings[1][i] + 10,
             str(counted_ratings[1][i]))
plt.title("I have watched a total of %d movies" %(sum(counted_ratings[1][:])))
plt.show()

Plotting information about the genres.
For better visualization, I will group less-watched genres into Others and limit the pie chart to eight wedges.
The popularity of each genre in the Others category is listed in the table below.
genres_dataframe = pd.DataFrame(
    data = {
        'Genre': counted_genres[0].tolist(),
        'value' : counted_genres[1].tolist()},
    ).sort_values('value', ascending = False)
# The top 7 most watched genres
genres_dataframe_most_watched = genres_dataframe[:7].copy()
# The remaining genres are summed up altogether
genres_dataframe_others = pd.DataFrame(data = {
    'Genre' : ['Others'],
    'value' : [genres_dataframe['value'][7:].sum()]
})
# Combine dataframes
genres_dataframe_compact = pd.concat([genres_dataframe_most_watched, genres_dataframe_others])
# I like blue :)
color_shades = ['#728FCE','#4863A0','#2B547E','#36454F', '#29465B','#2B3856','#123456', '#151B54']
fig, ax = plt.subplots(figsize=(6, 6))
patches, texts, pcts = ax.pie(
    genres_dataframe_compact['value'], labels=genres_dataframe_compact['Genre'].tolist(), autopct='%.1f%%',
    colors=color_shades,
    wedgeprops={'linewidth': 2.0, 'edgecolor': 'wheat'},
    textprops={'size': 'large', 'fontweight': 'bold'})
# Set the corresponding text label color to the wedge's face color.
for i, patch in enumerate(patches):
    texts[i].set_color(patch.get_facecolor())
plt.setp(pcts, color='white')
plt.setp(texts, fontweight='bold')
ax.set_title('Most watched genres', fontsize=18)
plt.tight_layout()
plt.show()
genres_dataframe_less_watched = genres_dataframe[7:].copy()
genres_dataframe_less_watched.rename(columns = {'value':'Movies watched'}, inplace = True)
# Note: the Styler returned here is never rendered; the print call below outputs plain text.
genres_dataframe_less_watched.style.set_table_styles(
    [
        {
            'selector': 'th',
            'props': [('background-color', '#D3D3D3')]
        },
    ]
)
print(genres_dataframe_less_watched.to_string(index=False))

Genre Movies watched
Sci-Fi 417
Romance 307
Fantasy 290
Horror 282
Family 182
Biography 134
Animation 122
War 85
History 82
Western 37
Sport 36
Documentary 34
Music 31
Musical 29
Film-Noir 15
Talk-Show 1
Next, I will plot the average rating for each genre.
plt.figure(figsize=(6, 6))
plt.xlabel("Genre")
plt.ylabel("Average rating")
my_avg_genre_rating = np.zeros([len(counted_genres[0])]) # Running sums, initialized to zero
imdb_avg_genre_rating = np.zeros([len(counted_genres[0])]) # Running sums, initialized to zero
for i, genre in enumerate(counted_genres[0]):
    for k in range(len(ratings['Genres'])):
        if (genre in str(ratings['Genres'][k])) and str(ratings['Title Type'][k]) == 'Movie':
            # 'Music' is a substring of 'Musical', so skip musicals when summing for 'Music'
            if genre == 'Music' and 'Musical' in str(ratings['Genres'][k]):
                # I am sure there's a better way to perform these loops :)
                pass
            else:
                my_avg_genre_rating[i] += ratings['Your Rating'][k]
                imdb_avg_genre_rating[i] += ratings['IMDb Rating'][k]
    # Turn the sums into averages by dividing by the number of movies in the genre
    my_avg_genre_rating[i] = my_avg_genre_rating[i]/counted_genres[1][i]
    imdb_avg_genre_rating[i] = imdb_avg_genre_rating[i]/counted_genres[1][i]
x_axis = np.arange(len(counted_genres[0]))
plt.bar(x_axis + 0.2, my_avg_genre_rating, width=0.4, label = 'Mine', color="#002c5a",edgecolor="sienna")
plt.bar(x_axis - 0.2, imdb_avg_genre_rating, width=0.4, label = 'IMDb', color="#96caff",edgecolor="maroon")
plt.xticks(x_axis,counted_genres[0])
plt.xticks(rotation=90)
plt.legend()
plt.grid(color = 'green', linestyle = '--', linewidth = 0.5)
plt.grid(axis = 'x')
plt.ylim([0, 10])
plt.show()
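As for a better way to perform these loops, one option is to let pandas do the grouping. The sketch below is not the code used for the plot above. It assumes the same column names (`Title Type`, `Genres`, `Your Rating`, `IMDb Rating`) and uses a few made-up rows. Because each genre becomes its own exact value, `Music` and `Musical` never collide and no special case is needed.

```python
import pandas as pd

# Made-up rows standing in for ratings.csv (illustrative data only).
sample = pd.DataFrame({
    "Title Type": ["Movie", "Movie", "Movie", "TV Series"],
    "Genres": ["Music, Drama", "Musical, Drama", "Drama", "Drama"],
    "Your Rating": [8, 6, 7, 10],
    "IMDb Rating": [7.0, 6.5, 8.0, 9.0],
})

movies = sample[sample["Title Type"] == "Movie"]

# One row per (movie, genre) pair, matched on the exact genre name.
per_genre = movies.assign(Genre=movies["Genres"].str.split(", ")).explode("Genre")

# Average of both rating columns for every genre, sorted by genre name.
genre_means = per_genre.groupby("Genre")[["Your Rating", "IMDb Rating"]].mean()

print(genre_means)
```

The resulting table has one row per genre, so its two columns could be passed to `plt.bar` in place of the two arrays filled by the nested loops.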

As somewhat expected for someone born in 1990, most of the movies I have watched are fairly recent.
Additionally, only a few classics have stood the test of time.
plt.figure(figsize=(6, 6))
plt.bar(counted_years[0], counted_years[1], color ='#4682B4', width = 0.5, edgecolor="maroon",linewidth = 0.3)
plt.xlabel("Release year")
plt.ylabel("Number of movies")
plt.title("Release year of movies I've watched")
for i in range(0, len(counted_years[0]),5):
    plt.text(counted_years[0][i], counted_years[1][i] + 0.5,
             str(counted_years[1][i]),color='crimson',fontweight='bold')
plt.grid(color = 'blue', linestyle = '--', linewidth = 0.2)
plt.show()

Let’s now take a look at the average ratings as a function of release year.
my_avg_rating = np.empty([len(counted_years[0])]) # Empty array
imdb_avg_rating = np.empty([len(counted_years[0])]) # Empty array
for i in range(len(counted_years[0])):
    idxs = my_movie_years == counted_years[0][i]
    my_avg_rating[i] = np.mean(my_movie_ratings[idxs])
    imdb_avg_rating[i] = np.mean(imdb_ratings[idxs])
avg_ratings_df = pd.DataFrame({
'Year': counted_years[0],
'My Average Rating': my_avg_rating,
'IMDb Average Rating': imdb_avg_rating
})
# plotting graph
ax = avg_ratings_df.plot(x="Year", y=["My Average Rating", "IMDb Average Rating"], kind="bar", figsize=(20, 6),
                         subplots=False,
                         grid=True,
                         color={"My Average Rating": "#002c5a", "IMDb Average Rating": "#96caff"})
ax.set_axisbelow(True)
ax.grid(color='r', linestyle='--', linewidth=0.5)
ax.grid(axis='x')
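The per-year averages can also be computed with a single `groupby` instead of a loop over the unique years. This is a hedged sketch on made-up rows. The column name `Year` is an assumption on my part, since the code above selects the release year by position, while `Your Rating` and `IMDb Rating` are the names used earlier in the notebook.

```python
import pandas as pd

# Made-up rows (illustrative data only); "Year" is an assumed column name.
sample = pd.DataFrame({
    "Year": [1995, 1995, 2011, 2012],
    "Your Rating": [9, 7, 10, 9],
    "IMDb Rating": [8.0, 7.0, 8.3, 8.3],
})

# Mean of both rating columns per release year, with the column names used in the plot.
yearly_means = (
    sample.groupby("Year")[["Your Rating", "IMDb Rating"]]
    .mean()
    .rename(columns={"Your Rating": "My Average Rating",
                     "IMDb Rating": "IMDb Average Rating"})
    .reset_index()
)

print(yearly_means)
```

The result has the same three columns as `avg_ratings_df`, so the same `plot` call should apply to it unchanged.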

Let’s now see how aligned I am with the masses.
It seems that we generally agree on the 7’s.
plt.figure(figsize=(6, 6))
plt.scatter(my_movie_ratings, imdb_ratings, alpha=0.5, edgecolors="k")
m, b = np.polyfit(my_movie_ratings, imdb_ratings, deg=1)
# Create a sequence of 50 points from the lowest to the highest rating I have given
sequence_x = np.linspace(counted_ratings[0][0], counted_ratings[0][-1], num=50)
# Plot regression line
plt.plot(sequence_x, b + m * sequence_x, color="dodgerblue", linestyle = '--', lw=1.5)
# Plot y = x line
plt.plot(sequence_x, sequence_x, color="darkgoldenrod", linestyle = '--', lw=1.5)
mean_rating = np.empty([len(counted_ratings[0])]) # Empty array
std_rating = np.empty([len(counted_ratings[0])]) # Empty array
for i in range(len(counted_ratings[0])):
    idxs = my_movie_ratings == counted_ratings[0][i]
    imdb_ratings_aux = imdb_ratings[idxs]
    mean_rating[i] = np.mean(imdb_ratings_aux)
    std_rating[i] = np.std(imdb_ratings_aux)
plt.scatter(counted_ratings[0], mean_rating, color="crimson",edgecolors="orange")
plt.errorbar(counted_ratings[0], mean_rating, std_rating, fmt="o", color="r")
plt.xlabel('My rating')
plt.ylabel('IMDb rating')
plt.title('How much do I agree with IMDb ratings?')
plt.grid(color = 'blue', linestyle = '--', linewidth = 0.2)
plt.show()
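To put a number on the agreement, a few summary statistics could accompany the plot. This is a minimal sketch on made-up scores, not my actual ratings. The variable names are hypothetical, and it uses the same `np.polyfit` call as above together with `np.corrcoef`.

```python
import numpy as np

# Made-up paired scores (illustrative data only).
my_scores = np.array([6, 7, 7, 8, 9, 10], dtype=float)
imdb_scores = np.array([6.5, 7.1, 6.9, 7.8, 8.4, 8.9])

# Pearson correlation: how closely the two sets of scores move together.
pearson_r = np.corrcoef(my_scores, imdb_scores)[0, 1]

# Slope and intercept of the least-squares line, as in the plot above.
slope, intercept = np.polyfit(my_scores, imdb_scores, deg=1)

# Average of (my score - IMDb score); positive means I rate higher on average.
mean_gap = np.mean(my_scores - imdb_scores)

print(f"r = {pearson_r:.2f}, slope = {slope:.2f}, mean gap = {mean_gap:.2f}")
```

A slope below 1 would suggest that IMDb scores move less than mine from one movie to the next, which is what the regression line falling away from the y = x line shows in the plot.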


Macau, China