Read the dataset “Movie_Data.csv” and create a data frame.

Part 0 – Use appropriate commands to understand which variables you have and what type each one is. Set the seed to the last 4 digits of your UIN and select a random sample of 150 movies from the Movie_Data data frame. Remove any observations that have missing values. For the quantitative variables, check whether the data are approximately normal. For the variables that are not normal, is it still OK to proceed with our hypothesis tests and confidence interval construction? Why or why not?

Part 1 – Mean

1) One is curious how long the average movie really is.
   a) Using any method you like, construct a 95% confidence interval for the mean run time of movies.
   b) How do you interpret your results?
2) Movie critics claim that ratings are artificially inflated. The ratings in the data frame are on a scale of 0 to 10. If the system is unbiased, the mean rating should be around 5. Test whether there is significant evidence in the data that the population mean rating is different from 5.
   a) State the two hypotheses to be tested.
   b) Compute the mean and the standard deviation of the rating.
   c) Determine whether there is significant evidence that the mean rating is different from 5.

Part 2 – One Proportion

1) You are interested in determining the percentage of movies that are classified as action movies.

   a) Compute the proportion of films in your sample that are action movies.
   b) Construct a 95% confidence interval for the population proportion.
   c) How do you interpret your results?
   d) Construct a 90% confidence interval.
   e) How is the new interval different from the old one? Explain the difference both mathematically and conceptually (e.g., how is 90% confidence different from 95%, and how does that affect the results?).
2) A producer claims that the percentage of movies that lose money is less than 30%.

   a) Create a new variable called profit, defined as revenue minus budget. Use it to compute the percentage of movies that lost money.

   b) Restate the producer’s claim in terms of a null and an alternative hypothesis.
   c) Use the profit variable to compute the sample proportion.
   d) At a 5% significance level, is there evidence to support the producer’s claim?

Part 3 – Two Means
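For a two-means comparison like the one in this part, one option is Welch's two-sample t statistic with a one-sided alternative. The following is a minimal sketch rather than a required method: it uses only Python's standard library, the function name `welch_one_sided` and the simulated revenue lists are hypothetical, and the p-value uses a standard normal approximation in place of the t distribution, which is reasonable only when both groups are fairly large.

```python
import math
import random
import statistics
from statistics import NormalDist


def welch_one_sided(x, y):
    """Test H0: mean(x) <= mean(y) against Ha: mean(x) > mean(y)."""
    n1, n2 = len(x), len(y)
    m1, m2 = statistics.mean(x), statistics.mean(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)  # n - 1 denominators
    se = math.sqrt(v1 / n1 + v2 / n2)        # unpooled standard error
    t_stat = (m1 - m2) / se
    # Welch-Satterthwaite approximate degrees of freedom
    df = (v1 / n1 + v2 / n2) ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
    )
    # ASSUMPTION: normal approximation to the t distribution (large samples)
    p_value = 1.0 - NormalDist().cdf(t_stat)
    return t_stat, df, p_value


# Hypothetical stand-in data: simulated revenues, NOT the real Movie_Data sample.
random.seed(1234)
action = [random.gauss(150e6, 120e6) for _ in range(40)]
comedy = [random.gauss(90e6, 70e6) for _ in range(45)]

t_stat, df, p_value = welch_one_sided(action, comedy)
print(f"t = {t_stat:.3f}, df = {df:.1f}, approximate one-sided p = {p_value:.4f}")
```

With small groups, a t distribution with `df` degrees of freedom would give a more accurate p-value than the normal approximation used here.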

A person is wondering whether Action movies have a higher mean revenue than Comedies. Conduct a test to see if the data support the claim that the mean box office revenue for Action movies is greater than that for Comedies.

1) State the hypotheses to be tested.
2) Compute the sample mean and sample standard deviation for each set of movies.
3) At a 5% significance level, is there evidence that the mean for Action films is greater?
4) Are there any concerns about the test?

Part 4 – Two Proportions

The film industry has been accused of being biased against female directors. We are going to test whether the proportion of high-budget films among those directed by female directors differs from the proportion among films by other directors.

1) Compute the proportion of films directed by people who identify as female that have a budget over 10,000,000, and the proportion of films directed by people who do not identify as female that have a budget over 10,000,000.
2) Determine whether there is significant evidence of a difference in the two proportions.
3) Are there any concerns about the data?

Part 5 – Variance

An executive director claims that his film did not lose a lot of money, relatively speaking, because the standard deviation of film revenue is $180,000,000. You believe it might be different from that.

1) State the hypotheses you are testing.
2) Conduct a test to determine whether your sample variance is significantly different from the executive director’s number, and whether you can refute their claim.
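One way to approach the variance question is a chi-square test for a single variance, whose statistic is (n - 1)s²/σ₀² with n - 1 degrees of freedom. The sketch below is illustrative, not a prescribed solution: it uses Python with only the standard library, the helper names `chi2_cdf` and `variance_test` and the simulated revenues are hypothetical, and the test assumes the revenues are roughly normal, which is worth checking because revenue data are often skewed.

```python
import math
import random
import statistics


def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma.

    Adequate for moderate x and df; not tuned for extreme tails.
    """
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > total * 1e-15 and k < 100_000:
        term *= z / (a + k)
        total += term
        k += 1
    log_prefactor = a * math.log(z) - z - math.lgamma(a)
    return min(1.0, total * math.exp(log_prefactor))


def variance_test(values, sigma0):
    """Two-sided chi-square test of H0: sigma == sigma0 (assumes normal data)."""
    n = len(values)
    s2 = statistics.variance(values)      # sample variance, n - 1 denominator
    stat = (n - 1) * s2 / sigma0 ** 2     # chi-square statistic with n - 1 df
    lower = chi2_cdf(stat, n - 1)
    p_value = min(1.0, 2.0 * min(lower, 1.0 - lower))
    return stat, p_value


# Hypothetical stand-in data: simulated revenues, NOT the real Movie_Data sample.
random.seed(1234)
revenues = [random.gauss(200e6, 180e6) for _ in range(150)]

stat, p_value = variance_test(revenues, sigma0=180e6)
print(f"chi-square = {stat:.2f}, df = {len(revenues) - 1}, p-value = {p_value:.4f}")
```

A p-value below the chosen significance level would suggest the population standard deviation differs from $180,000,000; otherwise the sample does not refute the claimed value.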

