(Sandipan Dey, July 25, 2017)

The following appeared as a project assignment (using the Open Science Framework) in the Coursera course Improving your Statistical Inferences (by Eindhoven University of Technology). The project is available here.

First, we need to pre-register the study to control the Type I error rate and reduce publication bias, as required by the OSF and shown below:

Theoretical hypothesis: We are going to test the following theoretical hypothesis. Both Satyajit Ray (from Kolkata, India) and Akira Kurosawa (from Japan) are great directors, and both of them won the Academy Award for Lifetime Achievement. Because they are both great, the movies they directed are equally good.

Dependent variables to be measured: The dependent variables to be measured are the IMDB rating (score) of each movie and the number of users who rated it.

First, an IMDB search will be run separately for each of the two legendary directors to get all the hits.

Then the search results will be sorted by release date, and the 29 most recent full-length movies (excluding documentaries / TV series) will be used.

So in this case, we shall use the last 29 movies that Satyajit Ray and Akira Kurosawa each directed (excluding documentaries / TV series), as of the moment we did the IMDB search.

The following table shows the data collected for Satyajit Ray movies.

| Movie | Rating (out of 10) | # Users rated | Release year |
|---|---|---|---|
| The Stranger (Agantuk) | 8.1 | 1760 | 1991 |
| Shakha Prosakha | 7.6 | 453 | 1990 |
| Ganashatru | 7.3 | 662 | 1989 |
| Ghare Baire | 7.7 | 812 | 1984 |
| Hirak Rajar Deshe | 8.8 | 1387 | 1980 |
| Jai Baba Felunath | 7.9 | 1086 | 1979 |
| Shatranj Ke Khilari | 7.8 | 2370 | 1977 |
| Jana Aranya | 8.3 | 887 | 1976 |
| Sonar Kella | 8.5 | 1308 | 1974 |
| Distant Thunder (Ashani Sanket) | 8.2 | 908 | 1973 |
| Company Limited (Seemabaddha) | 8.0 | 782 | 1971 |
| Pratidwandi | 8.2 | 1051 | 1970 |
| Days and Nights in the Forest (Aranyer Din Ratri) | 8.3 | 1720 | 1970 |
| Goopy Gayen Bagha Bayen | 8.8 | 1495 | 1969 |
| Chiriyakhana | 7.2 | 477 | 1967 |
| Nayak | 8.3 | 1974 | 1966 |
| Mahapurush | 7.3 | 719 | 1965 |
| The Coward (Kapurush) | 7.8 | 858 | 1965 |
| Charulata | 8.3 | 3597 | 1964 |
| Mahanagar | 8.3 | 2275 | 1963 |
| Abhijaan | 8.0 | 781 | 1962 |
| Kanchenjungha | 8.0 | 706 | 1962 |
| Teen Kanya | 8.2 | 991 | 1961 |
| Devi | 8.0 | 1407 | 1960 |
| The World of Apu (Apur Sansar) | 8.2 | 8058 | 1959 |
| The Music Room (Jalshaghar) | 8.1 | 3872 | 1958 |
| Paras-Pathar | 7.8 | 723 | 1958 |
| Aparajito | 8.2 | 7880 | 1956 |
| Pather Panchali | 8.4 | 15799 | 1955 |

The following table shows the data collected for Akira Kurosawa movies.

| Movie | Rating (out of 10) | # Users rated | Release year |
|---|---|---|---|
| Maadadayo | 7.4 | 4035 | 1993 |
| Rhapsody in August | 7.3 | 5131 | 1991 |
| Dreams | 7.8 | 19373 | 1990 |
| Ran | 8.2 | 84277 | 1985 |
| Kagemusha | 8.0 | 25284 | 1980 |
| Dersu Uzala | 8.3 | 18898 | 1975 |
| Dodes’ka-den | 7.5 | 4839 | 1970 |
| Red Beard | 8.3 | 12295 | 1965 |
| High and Low | 8.4 | 19989 | 1963 |
| Sanjuro | 8.2 | 22296 | 1962 |
| Yojimbo | 8.3 | 80906 | 1961 |
| The Bad Sleep Well | 8.1 | 8082 | 1960 |
| The Hidden Fortress | 8.1 | 25980 | 1958 |
| The Lower Depths | 7.5 | 3776 | 1957 |
| Throne of Blood | 8.1 | 34723 | 1957 |
| I Live in Fear | 7.4 | 3090 | 1955 |
| Seven Samurai | 8.7 | 247406 | 1954 |
| Ikiru | 8.3 | 46692 | 1952 |
| The Idiot | 7.4 | 3533 | 1951 |
| Rashomon | 8.3 | 112668 | 1950 |
| Scandal | 7.4 | 2580 | 1950 |
| Stray Dog | 7.9 | 11789 | 1949 |
| The Quiet Duel | 7.5 | 2131 | 1949 |
| Drunken Angel | 7.8 | 7422 | 1948 |
| One Wonderful Sunday | 7.3 | 1988 | 1947 |
| Waga seishun ni kuinashi | 7.2 | 2158 | 1946 |
| Asu o tsukuru hitobito | 6.6 | 119 | 1946 |
| The Men Who Tread on the Tiger’s Tail | 6.8 | 2567 | 1945 |
| Zoku Sugata Sanshirô | 6.2 | 1419 | 1945 |
| Zoku Sugata Sanshirô | 5.8 | 1229 | 1944 |
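To get a first feel for the two samples, the ratings can be summarised with a few lines of code. The following is a minimal Python sketch (not the R code used for the assignment); the helper name `describe` is hypothetical, and the lists simply contain every rating row as it appears in the two tables above, so the Kurosawa list holds 30 values.

```python
from statistics import mean, stdev

# IMDB ratings copied row by row from the two tables above, most recent movie first.
ray = [8.1, 7.6, 7.3, 7.7, 8.8, 7.9, 7.8, 8.3, 8.5, 8.2, 8.0, 8.2, 8.3, 8.8, 7.2,
       8.3, 7.3, 7.8, 8.3, 8.3, 8.0, 8.0, 8.2, 8.0, 8.2, 8.1, 7.8, 8.2, 8.4]
kurosawa = [7.4, 7.3, 7.8, 8.2, 8.0, 8.3, 7.5, 8.3, 8.4, 8.2, 8.3, 8.1, 8.1, 7.5, 8.1,
            7.4, 8.7, 8.3, 7.4, 8.3, 7.4, 7.9, 7.5, 7.8, 7.3, 7.2, 6.6, 6.8, 6.2, 5.8]


def describe(name, ratings):
    """Print and return (n, mean, sample standard deviation) for one director."""
    n, m, s = len(ratings), mean(ratings), stdev(ratings)
    print(f"{name}: n = {n}, mean = {m:.2f}, sd = {s:.2f}")
    return n, m, s


ray_summary = describe("Satyajit Ray", ray)
kurosawa_summary = describe("Akira Kurosawa", kurosawa)
```

The sample standard deviation (`stdev`, with n - 1 in the denominator) is used because the movies are treated as a sample for the inferential tests that follow.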

Justify the sample size: We want to predict no difference, and thus we shall do a power analysis for an equivalence test. We want to be pretty sure that we can reject our smallest effect size of interest, so we shall design a study with 84% power. For this educational assignment, we do not collect a huge amount of data.

As long as we can exclude a large effect (Cohen’s d = 0.8 or larger), we shall be happy for this assignment.

The power analysis estimates that the sample size we need to show that the difference between the ratings for movies directed by Satyajit Ray and Akira Kurosawa is smaller than Cohen’s d = 0.8 (assuming the true effect size is 0, with an α of 0.05, and aiming for 84% power) is 29 movie ratings from Satyajit Ray and 29 movie ratings from Akira Kurosawa, as can be seen from the following R code and the figures.

The α-level we found acceptable is 0.05.

We performed a two-sided test.

We used 84% power for this study.

The effect size expected is 0.78948 < 0.8, as shown below.

Given that Satyajit Ray directed a total of 29 full-length movies, we can only collect 29 observations for him, so we collected the same amount of sample data (29 movies) for each of the directors.
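The link between 29 movies per director and 84% power can be checked with a short calculation. The sketch below is in Python rather than R and rests on an assumption: it uses the common normal approximation for the power of a two-sample equivalence (TOST) test with symmetric bounds and a true effect of 0, which may differ slightly from what the original R code does. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def tost_power(n_per_group, bound_d, alpha=0.05):
    """Approximate power of a two-sample TOST with bounds -bound_d and +bound_d,
    assuming a true effect of 0 (normal approximation, an assumption here)."""
    power = 2 * Z.cdf(sqrt(n_per_group / 2) * bound_d - Z.inv_cdf(1 - alpha)) - 1
    return max(power, 0.0)


def tost_sample_size(power, bound_d, alpha=0.05):
    """Approximate per-group sample size needed to reach the requested power."""
    beta = 1 - power
    return 2 * (Z.inv_cdf(1 - alpha) + Z.inv_cdf(1 - beta / 2)) ** 2 / bound_d ** 2


power_at_29 = tost_power(29, 0.8)
n_for_84 = tost_sample_size(0.84, 0.8)
print(f"power with 29 per group: {power_at_29:.3f}")
print(f"n per group for 84% power: {n_for_84:.1f}")
```

Under this approximation, 29 ratings per director give a power close to 84% for bounds of d = ±0.8, which is consistent with the numbers stated above.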

The following theory is going to be used for the statistical tests:

Results

As can be seen from above, the sample size required to obtain 84% power is 29.

Specify the statistical test to conduct: We need to translate our theoretical hypothesis to a statistical hypothesis.

Let’s calculate the 90% CI around the effect size.

When the 90% CI falls below, and excludes, a Cohen’s d of 0.8, we can consider the ratings of the movies directed by Satyajit Ray and Akira Kurosawa as equivalent.

As can be seen from the NHST test above, the effect is statistically significant, since the 90% confidence interval around the effect size does not contain 0.
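The effect size and its interval can be sketched in a few lines. This is a Python illustration, not the R code from the course, and it makes two assumptions: Cohen’s d is computed with the pooled standard deviation, and the 90% CI uses the large-sample normal approximation for the standard error of d rather than an exact noncentral-t interval. The ratings are every row of the two data tables; the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

# Ratings copied from the data tables (all listed rows).
ray = [8.1, 7.6, 7.3, 7.7, 8.8, 7.9, 7.8, 8.3, 8.5, 8.2, 8.0, 8.2, 8.3, 8.8, 7.2,
       8.3, 7.3, 7.8, 8.3, 8.3, 8.0, 8.0, 8.2, 8.0, 8.2, 8.1, 7.8, 8.2, 8.4]
kurosawa = [7.4, 7.3, 7.8, 8.2, 8.0, 8.3, 7.5, 8.3, 8.4, 8.2, 8.3, 8.1, 8.1, 7.5, 8.1,
            7.4, 8.7, 8.3, 7.4, 8.3, 7.4, 7.9, 7.5, 7.8, 7.3, 7.2, 6.6, 6.8, 6.2, 5.8]


def cohens_d(x, y):
    """Standardised mean difference using the pooled standard deviation."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var)


def d_confidence_interval(d, n1, n2, level=0.90):
    """Approximate CI for d from its large-sample standard error (an assumption)."""
    se = sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return d - z * se, d + z * se


d = cohens_d(ray, kurosawa)
ci_low, ci_high = d_confidence_interval(d, len(ray), len(kurosawa))
print(f"d = {d:.2f}, 90% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

Because the interval is approximate, its endpoints should be expected to land near, not exactly on, the values produced by the R analysis.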

Also, the TOST procedure results shown above indicate that the observed effect size d = 0.69 was not significantly within the equivalence bounds of d = -0.8 and d = 0.8, t(29) = −2.86, p = 0.997.

Also, the 90% CI (0.24, 1.14) around the effect size includes a Cohen’s d of 0.8; hence, we can consider the ratings of the movies directed by Satyajit Ray and Akira Kurosawa as not equivalent.

Hence, the effect is statistically significant, but not statistically equivalent.
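The logic of the TOST decision can be made explicit with a small sketch. It is written in Python and works on the effect-size scale with a normal approximation, so it is an illustration of the procedure under those assumptions and will not reproduce the t statistic or p-value reported above. The function name `tost_normal` and the way the standard error is derived (from d = 0.69 and 29 movies per director) are assumptions for this example.

```python
from math import sqrt
from statistics import NormalDist


def tost_normal(d, se, low=-0.8, high=0.8, alpha=0.05):
    """Two one-sided tests on an effect size d with standard error se.

    Returns (p_tost, equivalent). Equivalence is declared only when BOTH
    one-sided tests reject, i.e. when the larger of the two p-values < alpha.
    """
    z = NormalDist()
    p_lower = 1 - z.cdf((d - low) / se)   # H0: true effect <= lower bound
    p_upper = z.cdf((d - high) / se)      # H0: true effect >= upper bound
    p_tost = max(p_lower, p_upper)
    return p_tost, p_tost < alpha


# Observed effect size from the text; standard error approximated for n = 29 per group.
d_obs, n1, n2 = 0.69, 29, 29
se_obs = sqrt((n1 + n2) / (n1 * n2) + d_obs ** 2 / (2 * (n1 + n2)))

p_tost, equivalent = tost_normal(d_obs, se_obs)
print("equivalent" if equivalent else "not equivalent")
```

The test against the upper bound is the one that fails here: an observed d of 0.69 is too close to 0.8 to rule out an effect that large, so equivalence cannot be claimed.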

Supporting the alternative with Bayes Factors: As can be seen from the following results, the Bayes Factor of 50.17844 increases our belief in the alternative hypothesis (H1) over the null hypothesis (H0), starting with a small prior belief of 0.2 on the effect size.
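To show what a Bayes Factor of this kind measures, here is a minimal Python sketch. It is not the course code and will not reproduce the value 50.17844: it assumes a normal likelihood for the observed effect, a point null at 0, and a normal prior under H1, for which the marginal likelihood has a closed form. The prior mean of 0.2 is one reading of the "prior belief 0.2" mentioned above; the standard error of 0.27 and the prior standard deviation of 0.5 are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def bayes_factor_normal(obs, se, prior_mean, prior_sd):
    """BF10 for H1: effect ~ Normal(prior_mean, prior_sd) versus H0: effect = 0,
    given an observed effect `obs` with standard error `se`."""
    # Under H0 the observed effect is Normal(0, se).
    likelihood_h0 = NormalDist(0.0, se).pdf(obs)
    # Under H1, integrating the normal likelihood over the normal prior gives
    # Normal(prior_mean, sqrt(se^2 + prior_sd^2)) for the observed effect.
    likelihood_h1 = NormalDist(prior_mean, sqrt(se ** 2 + prior_sd ** 2)).pdf(obs)
    return likelihood_h1 / likelihood_h0


# obs is the effect size reported in the text; se and prior_sd are hypothetical.
bf10 = bayes_factor_normal(obs=0.69, se=0.27, prior_mean=0.2, prior_sd=0.5)
print(f"BF10 = {bf10:.2f}")
```

A value above 1 means the data are more likely under H1 than under H0; a value below 1 favours the null, as would happen for an observed effect close to 0.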

The following code is taken from the course itself and modified as required. It was originally written by Professor Daniel Lakens (© 2016) and is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (https://creativecommons.org/licenses/by-nc-sa/4.0/).