The objective of the first assignment is to determine whether TJX’s stock returns are normally distributed. We pulled pricing data for 899 days before the corporate data breach and 100 days after the event, then conducted several tests to draw a conclusion about the distribution of TJX’s returns. The KS test for the 1,000 observations suggested that TJX’s stock returns are not normally distributed. The F test suggested that the volatility of at least two samples is significantly different. We conclude that the returns of TJX are not normally distributed.

Introduction
The normal distribution is one of the most important statistical distributions because it is used to draw conclusions from sample data about the populations from which these samples are drawn. It also has some important characteristics; for example, it is symmetrical about its mean. In addition, it provides a benchmark for how data are dispersed: 99.73% of the probability mass lies within three standard deviations of the mean. The test of normality will help in analyzing other statistical features of TJX’s stock returns.

Method
In our analysis, we first conducted a KS test to determine whether the full set of observations was drawn from a normal distribution. Then we broke the 1,000 observations into subsets to test whether these smaller samples were normally distributed. Lastly, we tested the behavior of the sample means and the sample variances to check whether they were coming from the same distribution. The Kolmogorov-Smirnov (KS) test is a goodness-of-fit test; it measures the discrepancies between the observed values and the values expected under the model in question. In this assignment we use the KS test to determine whether TJX’s stock returns are normally distributed. First we calculate the mean and standard deviation for the 1,000 observations and also for each of the 10 samples of 100 observations. The mean:

$$\bar{u} = \frac{\sum_{i=1}^{n} u_i}{n}$$
Next we standardize the 1,000 observations using the mean and standard deviation calculated above. Each resulting Z score is mapped through the standard normal distribution to obtain the theoretical cumulative probability of that observation. We then take the maximum difference between the theoretical and actual cumulative probability distributions to conduct the KS test and determine whether the returns are normal. Normal Z score:

$$z = \frac{x - \mu}{\sigma}$$
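The standardization and maximum-difference steps can be sketched in a few lines of Python. This is a minimal illustration, not the spreadsheet used for the assignment: the function names `ks_statistic` and `ks_critical_5pct` are hypothetical, the data are simulated rather than TJX prices, the sample standard deviation (n - 1 divisor) is an assumption, and the critical value uses the common large-sample approximation 1.36/√n instead of a printed KS table.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def ks_statistic(returns):
    """Return the KS distance D between the sample and a fitted normal."""
    n = len(returns)
    mu = mean(returns)
    sigma = stdev(returns)  # assumption: sample standard deviation (n - 1)
    std_normal = NormalDist()
    d = 0.0
    for i, x in enumerate(sorted(returns), start=1):
        z = (x - mu) / sigma             # standardize the observation
        theoretical = std_normal.cdf(z)  # theoretical cumulative probability
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, abs(i / n - theoretical), abs(theoretical - (i - 1) / n))
    return d


def ks_critical_5pct(n):
    """Large-sample approximation of the 5% critical D value."""
    return 1.36 / math.sqrt(n)


# Hypothetical data: simulated normal "returns", not TJX prices.
rng = random.Random(0)
simulated = [rng.gauss(0.0005, 0.015) for _ in range(1000)]
d_stat = ks_statistic(simulated)
print(f"D = {d_stat:.6f}, critical = {ks_critical_5pct(len(simulated)):.6f}")
```

Normality is rejected when D exceeds the critical value. For n = 1,000 the approximation gives roughly .0430, and for n = 100 it gives .1360.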
After normalizing, we took the absolute difference between the actual and theoretical cumulative distribution functions. We then consult the KS table to determine whether, at a 95% confidence level, the critical D value is greater or less than the maximum difference, D. We then use the formula below to test the difference between two means. This test looks at two of our sample means to see whether they are statistically different.

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^{1/2}}$$
Lastly, we will use the F distribution to compare two sample variances to see if they are statistically different.
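As a rough sketch of these last two tests, the t statistic for the difference between two means and the F ratio of two sample variances can be computed as follows. The function names and the input values are hypothetical and chosen only to exercise the formulas. The sketch assumes the null hypothesis μ1 - μ2 = 0 and the usual convention of placing the larger variance in the numerator of the F ratio.

```python
import math


def two_sample_t(mean1, mean2, var1, var2, n1, n2, hypothesized_diff=0.0):
    """t statistic for the difference between two sample means."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return ((mean1 - mean2) - hypothesized_diff) / standard_error


def variance_ratio_f(var_a, var_b):
    """F statistic with the larger sample variance in the numerator."""
    return max(var_a, var_b) / min(var_a, var_b)


# Hypothetical inputs, not figures from the TJX samples.
t_stat = two_sample_t(0.002, -0.001, 0.0004, 0.0003, 100, 100)
f_stat = variance_ratio_f(0.0004, 0.0001)
print(f"t = {t_stat:.4f}, F = {f_stat:.4f}")
```

Each statistic is then compared with the tabulated critical value for the relevant degrees of freedom. The null hypothesis of equal means (or equal variances) is rejected when the statistic exceeds that critical value.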
Using the KS test on the 1,000 observations, we determined that this sample was not normally distributed. The actual D value we calculated in our test was .051966. Using a sample size of 1,000 and a 95% confidence level, we determined the critical D value was .043000. Since the actual D value was larger than the critical D value, we reject the null hypothesis and conclude that returns are not normally distributed (see Appendix 1). We furthered our analysis by using the KS test to determine whether smaller subsets of our sample stock returns were normally distributed. At the 95% confidence level, the critical D value was .1360. Using the output in Appendix 2, we found that 9 of the 10 samples appear to be normally distributed. Next we tested the means from the different samples to determine whether they are statistically different (see Appendix 2). The largest mean was from the 1 – 100 sample and was .002266. The smallest mean was from the 201 – 300 sample and was -.001589. We determined from the previous KS test that both of these samples appear to be normally distributed. We implemented the difference between two means test and calculated a t statistic of 1.494204.
Using 198 degrees of freedom and a 95% confidence level, the critical t value from the tables is 1.652586. Since the actual value is less than the critical value, we fail to reject the null hypothesis and conclude there is no significant difference between the means. The two samples tested appear to come from the same distribution. Our team used another way to test whether the sample means were coming from the same distribution. We wanted to see whether the company’s 95% confidence intervals for the population mean overlap, using the sample mean from each subset and the standard error of the mean. Using 1.96 as our critical t value and .001398 as the standard error of the mean, our team calculated the lower and upper bounds of each 95% confidence interval. We determined that these all overlap, which implies no significant difference between the means.
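The overlap check can be reproduced for the two most extreme sample means reported above (.002266 and -.001589), using the stated critical value of 1.96 and standard error of .001398. This is a small sketch with hypothetical function names. It assumes the same standard error applies to every subset, as the text suggests.

```python
def confidence_interval(sample_mean, standard_error, critical_value=1.96):
    """Return (lower, upper) bounds for the population mean."""
    half_width = critical_value * standard_error
    return sample_mean - half_width, sample_mean + half_width


def intervals_overlap(a, b):
    """True when two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


STANDARD_ERROR = 0.001398  # standard error of the mean, as reported
largest = confidence_interval(0.002266, STANDARD_ERROR)    # largest sample mean
smallest = confidence_interval(-0.001589, STANDARD_ERROR)  # smallest sample mean

print(f"largest mean:  [{largest[0]:.6f}, {largest[1]:.6f}]")
print(f"smallest mean: [{smallest[0]:.6f}, {smallest[1]:.6f}]")
print("overlap:", intervals_overlap(largest, smallest))
```

Under that assumption every interval has the same half-width, so if the intervals around the largest and smallest means overlap, the intervals around all the means in between overlap as well.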
Therefore, the samples appear to be coming from the same distribution (see Appendix 3). To give further insight into whether the samples were coming from the same distribution, our team also looked at the sample variances to see whether the volatility of the subsets was equal (see Appendix 2). We used the F test to see whether the sample variances were statistically different. The highest variance was from the 1 – 100 sample and was .000410. The lowest variance was from the 401 – 500 sample and was .000105. Using this information we calculated an F statistic of 3.908839. The critical F at the 95% confidence level, with 99 degrees of freedom in both the numerator and the denominator, was 1.394061. Since the F statistic exceeds the critical F, we reject the null hypothesis and conclude that the variances of these two samples are significantly different.

Conclusion
* The KS test for the 1,000 observations suggested TJX’s stock returns are not normally distributed.
* The KS test for the 10 samples of 100 observations suggested that 9 of the 10 samples were normally distributed.
* The test for the difference between two means, and the range of the 95% confidence intervals for all the samples, suggest that they are coming from the same distribution.
* The F test suggested that the volatility of at least two samples is significantly different.

Discussion:
According to the KS test applied to our samples, it was interesting that the sample from 501 – 600 did not appear to be normally distributed, and neither did the full set of 1,000 observations. To illustrate the conclusion, we graphed TJX’s stock returns to see whether the data appeared to be normally distributed. The data seemed to follow a bell curve, which is in line with the shape of the normal distribution. However, the distribution appears to be fat-tailed compared to the index’s distribution.
Using EViews, our team applied a few additional statistical techniques to test the entire data set for normality, as shown in Table 1. First we looked at the skewness of the data set, which measures symmetry. The normal distribution has a skewness of zero. The stock has a skewness of 0.3752 while the index return has a skewness of -.1588. The positive value shows that the stock’s data is skewed to the right. Kurtosis measures whether the data is peaked or flat relative to a normal distribution. The kurtosis of a normal distribution is 3. The stock’s kurtosis is 6.266 while the index’s is 3.7822. These values above 3 imply a “peaked” distribution. The stock’s high kurtosis means the data set has a distinct peak near the mean, declines rather rapidly, and has heavy tails. In addition, the J-B statistic of 449.4175 indicates that the distribution is not normal.
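For readers who want to see how these three measures relate, the sketch below computes skewness, kurtosis, and the Jarque-Bera statistic from their textbook moment definitions. It is an illustration on simulated data, not a reproduction of the EViews output: the function name `moments_summary` is hypothetical, and EViews may apply slightly different estimator conventions, so the figures it reports need not match this calculation exactly.

```python
import random


def moments_summary(values):
    """Return (skewness, kurtosis, Jarque-Bera statistic) for a sample."""
    n = len(values)
    mean = sum(values) / n
    # Central moments with an n divisor (population-style estimates).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    jarque_bera = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    return skewness, kurtosis, jarque_bera


# Hypothetical data: simulated normal draws, not the TJX or index returns.
rng = random.Random(1)
sample = [rng.gauss(0.0, 0.015) for _ in range(1000)]
skew, kurt, jb = moments_summary(sample)
print(f"skewness = {skew:.4f}, kurtosis = {kurt:.4f}, JB = {jb:.4f}")
```

Under the null hypothesis of normality, the Jarque-Bera statistic is approximately chi-square distributed with 2 degrees of freedom, so values above roughly 5.99 reject normality at the 5% level. A statistic in the hundreds, as reported for the stock, is far beyond that threshold.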