Sources and other ideas

Sources and further steps

The dataset on which we based this datastory is the CMU Movie Summary Corpus.

Additional datasets

Since we compute a success rating for each movie, we need three additional kinds of data: reviews, inflation-adjusted profits (so that profitability can be compared across movies released in different years), and Oscar nominations and awards. None of this is available in the provided datasets, so we rely on additional datasets to obtain it.
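To make the inflation adjustment concrete, here is a minimal sketch of one common approach: rescaling a nominal amount by the ratio of a price index between the release year and a reference year. The function names, the reference year, and the index values below are assumptions made for illustration only. They are not taken from our datasets or from our actual pipeline.

```python
# Toy price index values, made up for illustration only (NOT real CPI data).
TOY_PRICE_INDEX = {1990: 50.0, 2000: 100.0, 2010: 200.0}


def adjust_for_inflation(amount, year, price_index, reference_year):
    """Express a nominal `amount` from `year` in `reference_year` money."""
    return amount * price_index[reference_year] / price_index[year]


def real_profit(revenue, budget, year, price_index, reference_year):
    """Inflation-adjusted profit, with profit defined here as revenue minus budget."""
    return adjust_for_inflation(revenue - budget, year, price_index, reference_year)


# Two hypothetical movies with the same nominal profit but different release years.
old_movie = real_profit(30.0, 10.0, 1990, TOY_PRICE_INDEX, 2010)
new_movie = real_profit(30.0, 10.0, 2010, TOY_PRICE_INDEX, 2010)
```

With these toy values, the older movie ends up with a higher adjusted profit than the newer one even though both have the same nominal profit, which is exactly why the adjustment is needed before comparing movies.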

Additional ideas

These are some ideas that we wanted to implement but that would require more resources and time.

  • Extract a sentiment score for each movie from its plot summary.
  • Extract sentiment scores from movie reviews and tweets.
  • Run multiple analyses on the Stanford CoreNLP dataset.
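As a rough idea of what the first two items could look like, here is a minimal lexicon-based sketch that scores a text by counting positive and negative words. This is only one simple technique, swapped in for illustration. The word lists and the function name are hypothetical, and a real analysis would more likely rely on an established sentiment model than on a hand-written lexicon.

```python
import re

# Tiny hypothetical lexicons, for illustration only.
POSITIVE_WORDS = {"love", "happy", "win", "hero", "friend"}
NEGATIVE_WORDS = {"kill", "death", "war", "lose", "enemy"}


def sentiment_score(text):
    """Return a score in [-1, 1]; 0.0 when no lexicon word appears in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    positive = sum(word in POSITIVE_WORDS for word in words)
    negative = sum(word in NEGATIVE_WORDS for word in words)
    matched = positive + negative
    if matched == 0:
        return 0.0
    return (positive - negative) / matched


# Example on a made-up one-line plot summary.
score = sentiment_score("The hero and his friend win the war")
```

The same function could be applied to a review or a tweet, since it only needs raw text as input.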