Correlation and Regression Analysis in Research Methodology
Research Methodology Module 1

Regression analysis begins by hypothesizing a model of the relationship, and estimates of the parameter values are used to develop an estimated regression equation. Various tests are then employed to determine if the model is satisfactory.
If the model is deemed satisfactory, the estimated regression equation can be used to predict the value of the dependent variable given values for the independent variables. Regression model. In simple linear regression the model is y = β0 + β1x + ε, where the error term ε accounts for variability in y that the linear relationship with x cannot explain. If the error term were not present, the model would be deterministic; in that case, knowledge of the value of x would be sufficient to determine the value of y.
Least squares method. Either a simple or multiple regression model is initially posed as a hypothesis concerning the relationship among the dependent and independent variables. The least squares method is the most widely used procedure for developing estimates of the model parameters.
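As a minimal sketch of the least squares computations (with made-up stress-score and blood-pressure numbers, not the 20-patient sample described below):

```python
import numpy as np

# Hypothetical stress-test scores (x) and blood-pressure readings (y),
# standing in for the medical-centre example in the text.
x = np.array([55.0, 62, 70, 74, 80, 85, 90, 95])
y = np.array([110.0, 118, 125, 130, 138, 142, 150, 158])

# Least squares estimates: b1 = Sxy / Sxx, b0 = ybar - b1 * xbar
xbar, ybar = x.mean(), y.mean()
b1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
b0 = ybar - b1 * xbar

print(f"estimated regression equation: y_hat = {b0:.2f} + {b1:.2f} x")
```

The estimates minimise the sum of squared vertical distances between the observed y values and the fitted line.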
As an illustration of regression analysis and the least squares method, suppose a university medical centre is investigating the relationship between stress and blood pressure. Assume that both a stress test score and a blood pressure reading have been recorded for a sample of 20 patients. Such data can be displayed graphically in a scatter diagram.
Values of the independent variable, stress test score, are given on the horizontal axis, and values of the dependent variable, blood pressure, are shown on the vertical axis. Correlation and regression analysis are related in the sense that both deal with relationships among variables. The correlation coefficient is a measure of linear association between two variables.
For simple linear regression, the sample correlation coefficient is the square root of the coefficient of determination, with the sign of the correlation coefficient being the same as the sign of b1, the coefficient of x1 in the estimated regression equation.
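This sign relationship is easy to check numerically; here is a small sketch with made-up data showing a negative association:

```python
import numpy as np

# Small illustrative sample with a negative association (made-up numbers).
x = np.array([1.0, 2, 3, 4, 5])
y = np.array([9.8, 7.9, 6.2, 3.9, 2.1])

# Least squares slope b1 and the sample correlation coefficient r.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
r = np.corrcoef(x, y)[0, 1]
r2 = r ** 2  # coefficient of determination

# For simple linear regression, r is the square root of r2,
# carrying the sign of the slope b1.
assert np.isclose(r, np.sign(b1) * np.sqrt(r2))
print(r, r2)
```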
Neither regression nor correlation analyses can be interpreted as establishing cause-and-effect relationships. They can indicate only how or to what extent variables are associated with each other. The correlation coefficient measures only the degree of linear association between two variables. Any conclusions about a cause-and-effect relationship must be based on the judgment of the analyst.
What is the difference between correlation and linear regression? Correlation and linear regression are not the same. What is the goal? Correlation quantifies the degree to which two variables are related. Correlation does not fit a line through the data points.
You are simply computing a correlation coefficient r that tells you how much one variable tends to change when the other one does. When r is 0, the two variables do not vary together at all. When r is positive, there is a trend for one variable to go up as the other goes up.
When r is negative, there is a trend for one variable to go up as the other goes down. Linear regression finds the best line that predicts Y from X. What kind of data? Correlation is almost always used when you measure both variables. It is rarely appropriate when one variable is something you experimentally manipulate. Linear regression is usually used when X is a variable you manipulate (time, concentration, etc.).
With correlation, you don't have to think about cause and effect. It doesn't matter which of the two variables you call "X" and which you call "Y". You'll get the same correlation coefficient if you swap the two. The decision of which variable you call "X" and which you call "Y" matters in regression, as you'll get a different best-fit line if you swap the two.
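This asymmetry is easy to demonstrate. The sketch below (made-up data) shows that although the two regressions give different slopes, the product of the slopes equals r squared, so both fits share the same R2:

```python
import numpy as np

x = np.array([2.0, 4, 6, 8, 10])
y = np.array([1.1, 2.4, 2.9, 4.2, 4.8])

def slope(a, b):
    # Least squares slope for predicting b from a.
    return np.sum((a - a.mean()) * (b - b.mean())) / np.sum((a - a.mean()) ** 2)

b_yx = slope(x, y)  # line predicting Y from X
b_xy = slope(y, x)  # line predicting X from Y

r = np.corrcoef(x, y)[0, 1]
# The two regressions give different best-fit lines...
print(b_yx, b_xy)
# ...but the product of the two slopes equals r squared,
# so both fits have the same R2.
assert np.isclose(b_yx * b_xy, r ** 2)
```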
The line that best predicts Y from X is not the same as the line that predicts X from Y; however, both lines have the same value for R2.

Assumptions

The correlation coefficient itself is simply a way to describe how two variables vary together, so it can be computed and interpreted for any two variables.
Further inferences, however, require an additional assumption -- that both X and Y are measured, and both are sampled from Gaussian distributions. This is called a bivariate Gaussian distribution. If those assumptions are true, then you can interpret the confidence interval of r and the P value testing the null hypothesis that there really is no correlation between the two variables and any correlation you observed is a consequence of random sampling.
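Under these assumptions, the usual test of the null hypothesis of no correlation uses the statistic t = r·sqrt(n−2)/sqrt(1−r²), referred to a t distribution with n−2 degrees of freedom. A minimal sketch with made-up data:

```python
import math
import numpy as np

# Made-up bivariate sample; both variables are measured, as the
# correlation framework described above requires.
x = np.array([3.0, 5, 7, 9, 11, 13, 15, 17])
y = np.array([2.1, 3.0, 4.4, 4.9, 6.2, 6.8, 8.1, 8.7])

n = len(x)
r = np.corrcoef(x, y)[0, 1]

# Test statistic for H0: no correlation in the population.
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
print(f"r = {r:.3f}, t = {t:.2f} with {n - 2} degrees of freedom")
```

A large |t| relative to the t distribution with n−2 degrees of freedom gives a small P value, i.e., evidence against the hypothesis that the observed correlation arose from random sampling alone.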
With linear regression, the X values can be measured or can be a variable controlled by the experimenter. The X values are not assumed to be sampled from a Gaussian distribution. The vertical distances of the points from the best-fit line (the residuals) are assumed to follow a Gaussian distribution, with the SD of the scatter not related to the X or Y values.
Relationship between results

Correlation computes the value of the Pearson correlation coefficient, r. Linear regression quantifies goodness of fit with r2, sometimes shown in uppercase as R2.
Introduction

Factor analysis attempts to represent a set of observed variables X1, X2, …, Xn in terms of a number of 'common' factors plus a factor which is unique to each variable. The common factors (sometimes called latent variables) are hypothetical variables which explain why a number of variables are correlated with each other -- it is because they have one or more factors in common.
A concrete physical example may help. Say we measured the size of various parts of the body of a random sample of humans: for example, such things as height; leg, arm, finger, foot and toe lengths; and head, chest, waist, arm and leg circumferences.
We'd expect that many of the measurements would be correlated, and we'd say that the explanation for these correlations is that there is a common underlying factor of body size. It is this kind of common factor that we are looking for with factor analysis, although in psychology the factors may be less tangible than body size.

To carry the body measurement example further, we probably wouldn't expect body size to explain all of the variability of the measurements: for example, there might be a lankiness factor, which would explain some of the variability of the circumference measures and limb lengths, and perhaps another factor for head size which would have some independence from body size (what factors emerge is very dependent on what variables are measured).
Even with a number of common factors such as body size, lankiness and head size, we still wouldn't expect to account for all of the variability in the measures (or explain all of the correlations), so the factor analysis model includes a unique factor for each variable which accounts for the variability of that variable which is not due to any of the common factors.
Why carry out factor analyses? If we can summarise a multitude of measurements with a smaller number of factors without losing too much information, we have achieved some economy of description, which is one of the goals of scientific investigation.
It is also possible that factor analysis will allow us to test theories involving variables which are hard to measure directly. Finally, at a more prosaic level, factor analysis can help us establish that sets of questionnaire items observed variables are in fact all measuring the same underlying factor perhaps with varying reliability and so can be combined to form a more reliable measure of that factor.
There are a number of different varieties of factor analysis: the discussion here is limited to principal axis factor analysis and factor solutions in which the common factors are uncorrelated with each other. It is also assumed that the observed variables are standardised (mean zero, standard deviation one) and that the factor analysis is based on the correlation matrix of the observed variables. The model expresses each observed variable as a weighted sum of the m common factors plus its unique factor:

X1 = a11F1 + a12F2 + … + a1mFm + U1
X2 = a21F1 + a22F2 + … + a2mFm + U2
…
Xn = an1F1 + an2F2 + … + anmFm + Un

The coefficients a11, a12, …, anm are weights in the same way as regression coefficients (because the variables are standardised, the constant is zero, and so is not shown).
For example, the coefficient a11 shows the effect on variable X1 of a one-unit increase in F1. In factor analysis, the coefficients are called loadings. In the model above, a11 is the loading for variable X1 on F1, a23 is the loading for variable X2 on F3, etc.
When the coefficients are correlations, i.e., when the factors are uncorrelated, the sum of the squared loadings for a variable gives the proportion of that variable's variance which is accounted for by the common factors. This is called the communality. The larger the communality for each variable, the more successful a factor analysis solution is. Why use factor analysis? Factor analysis is a useful tool for investigating variable relationships for complex concepts such as socioeconomic status, dietary patterns, or psychological scales. It allows researchers to investigate concepts that are not easily measured directly by collapsing a large number of variables into a few interpretable underlying factors.
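A small numeric sketch of the communality calculation, using a hypothetical loading matrix for four variables on two uncorrelated factors:

```python
import numpy as np

# Hypothetical loading matrix: rows = variables X1..X4,
# columns = uncorrelated factors F1, F2 (made-up values).
loadings = np.array([
    [0.80, 0.10],
    [0.75, 0.20],
    [0.30, 0.70],
    [0.25, 0.65],
])

# With uncorrelated factors, the communality of each variable is the
# sum of its squared loadings; the remainder (1 - communality) is the
# uniqueness attributable to that variable's unique factor.
communality = np.sum(loadings ** 2, axis=1)
uniqueness = 1 - communality
print(communality)
```

Here X1 and X2 load mainly on F1 and X3 and X4 mainly on F2, mirroring the body-size/head-size idea in the text.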
Introduction to Correlation and Regression Analysis
Regression analysis is a quantitative research method used when the study involves modelling and analysing several variables, where the relationship includes a dependent variable and one or more independent variables. In simple terms, regression analysis is a quantitative method used to test the nature of relationships between a dependent variable and one or more independent variables. For a single independent variable, the regression equation is y = a + bx, where y is the dependent variable, x is the independent variable, a is the intercept and b is the slope. Do not be intimidated by the visual complexity of correlation and regression formulae. Linear regression analysis is based on the following set of assumptions: the relationship between the variables is linear; the observations are independent; the residuals have constant variance (homoscedasticity); and the residuals are approximately normally distributed.
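As a quick sketch of how the residual-based assumptions can be checked (made-up data): the least squares residuals average zero by construction and should show no remaining linear trend against x.

```python
import numpy as np

# Made-up (x, y) sample for illustration.
x = np.array([1.0, 2, 3, 4, 5, 6, 7, 8])
y = np.array([2.2, 2.9, 4.1, 4.8, 6.3, 6.9, 8.2, 8.8])

# Least squares fit y_hat = b0 + b1 * x.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
residuals = y - (b0 + b1 * x)

print(residuals.mean())                  # ~0 by construction of least squares
print(np.corrcoef(x, residuals)[0, 1])   # ~0: no linear trend left in residuals
```

In practice one would also plot the residuals against x (or against the fitted values) to look for non-constant scatter or curvature.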
Methods of correlation and regression can be used in order to analyze the extent and the nature of relationships between different variables. Correlation analysis is used to understand the nature of relationships between two individual variables.