
# Statistics

Distribution of Variables

In ordinary least squares (OLS) regression analysis, the key variable is the regressand, in this case 'index'. For hypothesis testing to be valid, the error term in the model must be independently and identically normally distributed. With the regressors assumed to be fixed from sample to sample, 'index' should therefore also be normally distributed around its fitted value. Two points follow. First, no assumption is made about the distribution of the independent variables, as should be clear from the frequent use of dummy variables. Second, the normality assumption is required only for hypothesis testing; the OLS coefficients can be estimated whether or not it holds.
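The two points above can be illustrated with a minimal sketch. It assumes a single dummy regressor and synthetic, deliberately skewed errors; the data, the seed, the coefficient values and the helper names `ols_simple` and `jarque_bera` are all hypothetical and are not taken from the analysis itself. The coefficients are still estimated without difficulty, while the Jarque-Bera statistic on the residuals (approximately chi-squared with two degrees of freedom under normality, so a 5 per cent critical value of about 5.99) indicates whether the usual hypothesis tests can be trusted.

```python
import random
import statistics


def ols_simple(x, y):
    """Return (intercept, slope) of a one-regressor OLS fit."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def jarque_bera(residuals):
    """Jarque-Bera statistic: n/6 * (skewness^2 + excess_kurtosis^2 / 4)."""
    n = len(residuals)
    mean = statistics.fmean(residuals)
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)


# Synthetic data (an assumption for illustration only).
random.seed(0)
dummy = [i % 2 for i in range(200)]                        # a 0/1 regressor
errors = [random.expovariate(1.0) - 1.0 for _ in dummy]    # skewed, mean zero
index = [0.90 + 0.05 * d + 0.02 * e for d, e in zip(dummy, errors)]

# Estimation goes through regardless of the error distribution.
intercept, slope = ols_simple(dummy, index)
residuals = [y - (intercept + slope * d) for d, y in zip(dummy, index)]

# A large value relative to 5.99 casts doubt on normality-based tests.
jb = jarque_bera(residuals)
print(f"intercept={intercept:.4f} slope={slope:.4f} JB={jb:.2f}")
```

With a dummy regressor the fitted slope is simply the difference between the two group means of 'index', which is why nothing needs to be assumed about the regressor's own distribution.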

Potential problems and their resolution

The foregoing exercise is subject to a number of caveats. First, it was undertaken without a justified theoretical underpinning, so questions of misspecification can be considered only in a purely statistical sense. Second, some of the regressors, perhaps in particular 'sales', 'profit margin', 'ROCE' and 'current', are potentially volatile over time and might better be replaced with their averages over a backward-looking horizon of, say, five or ten years. Third, there is no real justification for assuming that a linear specification is appropriate. Fourth, the dependent variable 'index' shows little variability: its coefficient of variation (standard deviation divided by the mean) is only 2.5 per cent. One way to address this might be to increase the sample size, whether by adding further Greek companies or international comparators. Finally, while the apparent errors are generally small, 'index' is not measured simply as 'actual'/'max', and this may be a cause of at least some of the problems encountered.
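As a rough sketch of two of the calculations mentioned above, the following computes a coefficient of variation and a backward-looking average. The function names and the sample figures are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which definition lies behind the 2.5 per cent figure.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.fmean(values)


def trailing_mean(series, window):
    """Mean of each run of `window` consecutive observations, oldest first."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        statistics.fmean(series[end - window:end])
        for end in range(window, len(series) + 1)
    ]


# Hypothetical yearly values of a volatile regressor such as 'sales'.
sales = [1.0, 2.0, 3.0, 4.0, 5.0]
smoothed = trailing_mean(sales, window=3)   # one value per complete window

# Hypothetical values of a dependent variable; a small result signals
# little variation for the regressors to explain.
cv = coefficient_of_variation([1.0, 2.0, 3.0])
print(smoothed, cv)
```

A five-year or ten-year horizon would correspond to `window=5` or `window=10` on annual data; only the most recent smoothed value for each company would then enter the cross-sectional regression.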
