While descriptive statistics summarize the characteristics of a dataset, inferential statistics allow you to draw conclusions and make predictions based on that data. In other words, descriptive statistics describe a dataset, while inferential statistics let you make inferences that go beyond it.
When you have collected data from a sample, you can use inferential statistics to understand the larger population from which the sample was taken. In research, it is generally not possible to collect data from every individual in the world, so we collect data from a sample of individuals and then use the results from that sample to make predictions about what we would expect to see for the larger group. In practice, this is an oversimplification: you will likely have specific groups of individuals (i.e., specific populations) you are interested in researching (for example: all women over the age of 35, all employed adults in the US, all Christian families in Texas, all pediatric doctors in Mexico, etc.), so your population would be all individuals who fit your criteria of interest, not just all individuals in the entire world.
Inferential statistics have 2 main uses: (1) making estimates about populations, and (2) testing hypotheses to draw conclusions about populations (for example, whether two variables are related, or whether two groups differ from one another).
(Information adapted from Bhandari, P. (2023). Inferential Statistics | An Easy Introduction & Examples. Scribbr).
Correlation is a simple statistical analysis that is great for helping you predict the values of one variable based on another.
Correlations are a measure of how strongly 2 variables are related to each other. The number you will see in a correlation analysis (the correlation coefficient) represents the strength of the relationship between the 2 variables. Correlations range from -1 to +1, and values closer to either -1 or +1 signify stronger relationships. A correlation of 0 means no relationship.
Positive correlations mean that as one of the variables increases, so does the other. Negative correlations mean that as one variable increases, the other decreases.
Let’s run a correlation!
As a note, the Pearson bivariate correlation (bivariate just means 2 variables) is the most common type of correlation you will come across in research, though you will often see it simply referred to as a "correlation" or "correlational analysis." There are also other types of correlations, such as the Spearman rank-order correlation; however, for most intents and purposes with quantitative data, the Pearson correlation is the one you will likely use. This is because the Pearson correlational analysis is for Scale data, whereas the Spearman rank-order correlation is for Ordinal (rank-ordered) data. It is not advised to run correlations on Nominal data; doing so will not give meaningful results because the numeric values of Nominal variables just represent categories.
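If it helps to see the same idea outside of SPSS, here is a minimal sketch in Python using the scipy library. The numbers are made up for illustration (they are not values from our example dataset); the point is simply that Pearson and Spearman correlations are two different calculations applied to the same pair of variables.

```python
# Minimal sketch (not part of the SPSS steps): a Pearson correlation (for Scale
# data) next to a Spearman rank-order correlation (for Ordinal data),
# using made-up example values.
from scipy import stats

age = [25, 32, 41, 47, 53, 60]                        # hypothetical Scale variable
salary = [38000, 45000, 52000, 61000, 67000, 80000]   # hypothetical Scale variable

r, p = stats.pearsonr(age, salary)          # Pearson correlation coefficient and p-value
rho, p_rank = stats.spearmanr(age, salary)  # Spearman rank-order correlation

print(f"Pearson r = {r:.2f} (p = {p:.3f})")
print(f"Spearman rho = {rho:.2f} (p = {p_rank:.3f})")
```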
As another note, correlations only measure linear relationships, not parabolic, cubic, or other non-linear relationships. So even if your correlational analysis is not statistically significant, it is possible that the variables you are looking at relate to each other in some other way. (Two variables can be perfectly related, but if the relationship is not linear, a correlation coefficient is not an appropriate statistic for measuring their association).
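To see why this matters, here is a small illustrative sketch (again in Python with made-up values) where two variables are perfectly related by a curve, yet the Pearson correlation comes out at essentially zero:

```python
# Minimal sketch: a perfect non-linear (parabolic) relationship that a Pearson
# correlation completely misses, because Pearson only measures linear association.
from scipy import stats

x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]   # y is perfectly determined by x (y = x squared)

r, p = stats.pearsonr(x, y)
print(f"Pearson r = {r:.2f}")  # approximately 0, despite the perfect relationship
```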
As another example, let's run a correlation with 4 variables: Age, Salary, Years Employed, and Anxiety 1. Follow the steps above, but this time in Step 2, move Age, Salary, Years Employed, and Anxiety 1 over to the Variables box. Then click OK to run the analysis. See the resulting output table below. (You can add as many variables as you want to a correlational analysis, but keep in mind that the resulting correlation table will get larger with every variable you add, and it may consequently become more difficult to read accurately).
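For a rough sense of what a multi-variable correlation matrix looks like outside of SPSS, here is a minimal Python sketch using pandas. The column names mirror our example variables, but the values are made up for illustration, not taken from the actual dataset.

```python
# Minimal sketch: a correlation matrix for several variables at once, analogous
# to the SPSS output table (made-up data, hypothetical values).
import pandas as pd

df = pd.DataFrame({
    "Age":            [25, 32, 41, 47, 53, 60],
    "Salary":         [38000, 45000, 52000, 61000, 67000, 80000],
    "Years_Employed": [2, 6, 10, 15, 20, 28],
    "Anxiety_1":      [7, 6, 5, 5, 4, 3],
})

# Pearson correlations between every pair of variables
print(df.corr(method="pearson").round(2))
```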
When you check the box for Show only the lower triangle, the resulting correlation table will not display the mirror-image, identical values in the upper portion of the table. You can see what this looks like for our Age and Salary correlation table below. Now we see blank cells directly underneath the Salary column where it intersects with the Age row, because these cells would contain the exact same values as the cells in the Salary row where it intersects with the Age column. These repeated/mirrored values in the upper "triangle" of the table are not shown.
The effect of showing only the lower triangle becomes even more apparent in larger correlation tables with more variables; take a look at the correlation table below with Age, Salary, Years Employed, and Anxiety 1. (The top table does not have the box checked for showing only the lower triangle. The bottom table does have the box checked).
Notice how much easier it is to read the table when we check the box for Show only the lower triangle. As the upper triangle just repeats/mirrors the values in the lower triangle, you do not need to have the upper triangle visible to be able to interpret your correlation results.
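If you ever want to reproduce this lower-triangle view outside of SPSS, here is one hedged way to sketch it in Python with pandas and numpy (the same made-up data as above; masking the upper triangle is just one of several possible approaches):

```python
# Minimal sketch: showing only the lower triangle of a correlation matrix,
# similar to SPSS's "Show only the lower triangle" option (made-up data).
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Age":            [25, 32, 41, 47, 53, 60],
    "Salary":         [38000, 45000, 52000, 61000, 67000, 80000],
    "Years_Employed": [2, 6, 10, 15, 20, 28],
    "Anxiety_1":      [7, 6, 5, 5, 4, 3],
})

corr = df.corr().round(2)

# Keep the lower triangle (and diagonal); the blanked-out upper cells print as NaN
lower = corr.where(np.tril(np.ones(corr.shape, dtype=bool)))
print(lower)
```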
Generally, you’ll want to use the Two-Tailed test of significance (also called a two-tailed p-value) because a two-tailed test will test for any relationship between the 2 variables. A One-Tailed test only tests for one specific direction (either positive or negative, but not both), and to use a one-tailed test of significance (i.e., a one-tailed p-value) you would have had to make a hypothesis about the specific direction you expected to see prior to running the analysis. A two-tailed test tests for both positive and negative relationships, so if you don’t know how the variables may relate and just want to know whether they relate, use the two-tailed test.
When in doubt, it is almost always more appropriate to use a two-tailed test. A one-tailed test is only justified if you have a specific prediction (hypothesis) about the direction of the relationship (e.g., Age being positively correlated with Salary), and you are completely uninterested in the possibility that the opposite outcome could be true (e.g., Age being negatively correlated with Salary).
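As a hedged illustration (in Python with scipy and made-up values; the `alternative` argument assumes a reasonably recent SciPy release), the same correlation can be tested with either a two-tailed or a one-tailed p-value:

```python
# Minimal sketch: two-tailed vs. one-tailed p-values for the same correlation
# (made-up data; the `alternative` keyword exists in newer SciPy versions).
from scipy import stats

age = [25, 32, 41, 47, 53, 60]
salary = [38000, 45000, 52000, 61000, 67000, 80000]

# Two-tailed: tests for ANY relationship (positive or negative)
r, p_two = stats.pearsonr(age, salary, alternative="two-sided")

# One-tailed: only justified if you predicted a positive relationship in advance
_, p_one = stats.pearsonr(age, salary, alternative="greater")

print(f"r = {r:.2f}, two-tailed p = {p_two:.3f}, one-tailed p = {p_one:.3f}")
```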
Another useful option/setting that you can play around with is the Style settings of the correlation table.
Note: We selected Significance when we were adjusting the Style settings, but you could instead select Correlation and set a specific correlation coefficient value for SPSS to highlight in your table. You can set the condition to highlight the cells that have a correlation coefficient equal to or higher than your specified value. You could also have both a Significance condition and a Correlation condition: if you click Add in the Style settings window, you can add multiple conditions.
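Outside of SPSS, the same idea of flagging cells that meet a condition can be sketched in Python (made-up data and a hypothetical cutoff of 0.5; this simply prints which pairs meet the cutoff rather than styling a table):

```python
# Minimal sketch: flagging correlations at or above a chosen cutoff, analogous
# to adding a Correlation condition in the Style settings (made-up data).
import pandas as pd

df = pd.DataFrame({
    "Age":            [25, 32, 41, 47, 53, 60],
    "Salary":         [38000, 45000, 52000, 61000, 67000, 80000],
    "Years_Employed": [2, 6, 10, 15, 20, 28],
    "Anxiety_1":      [7, 6, 5, 5, 4, 3],
})

corr = df.corr()
cutoff = 0.5  # hypothetical highlighting condition

# Print each pair of variables whose correlation meets the condition
for var1 in corr.columns:
    for var2 in corr.columns:
        if var1 < var2 and abs(corr.loc[var1, var2]) >= cutoff:
            print(f"{var1} x {var2}: r = {corr.loc[var1, var2]:.2f}")
```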
A t-test is a statistical test that is used to compare the means of two groups. It is often used in hypothesis testing to determine whether an intervention or treatment actually has an effect on the people within the study, or whether two groups are different from one another. (Information adapted from Scribbr: https://www.scribbr.com/statistics/t-test/).
If the groups come from a single population (for example, if you’re measuring the same people before and after an experimental treatment), conduct a paired-samples t-test. This is a within-subjects design.
If the groups come from 2 different populations (for example, men and women, or people from two separate cities, or students from two different schools), conduct an independent-samples t-test. This is a between-subjects design.
A one-sample t-test is for comparing one group to a standard value or norm (like comparing the acidity of a liquid to a neutral pH of 7). A brief sketch of all three types appears below.
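For reference, here is a hedged, minimal Python sketch (using scipy and made-up numbers) showing which function corresponds to each type of t-test; in SPSS you would instead use the corresponding menu options described in this guide.

```python
# Minimal sketch: the three t-test types and their scipy.stats equivalents
# (made-up example values throughout).
from scipy import stats

before = [30.1, 28.4, 33.0, 29.5, 31.2]   # same people, time 1
after  = [28.9, 27.8, 31.5, 29.0, 30.1]   # same people, time 2
group_a = [12, 15, 14, 16, 13]            # one group of people
group_b = [18, 17, 19, 16, 20]            # a second, unrelated group

print(stats.ttest_rel(before, after))          # paired-samples t-test (within-subjects)
print(stats.ttest_ind(group_a, group_b))       # independent-samples t-test (between-subjects)
print(stats.ttest_1samp(group_a, popmean=15))  # one-sample t-test vs. a standard value
```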
Let's go over how to conduct a paired samples t-test. We'll use the variables Time Task 1 and Time Task 2 in this analysis to see if our sample's times for finishing the running race improved from the first time they ran the race (Time 1) to the second time (Time 2). We are using a paired samples t-test for this analysis because we are examining the same individuals across the two time points.
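As a rough companion to the SPSS steps (a minimal Python sketch with made-up race times, not our dataset's actual values), the paired comparison boils down to testing whether the average change from Time 1 to Time 2 differs from zero:

```python
# Minimal sketch: paired-samples t-test on hypothetical race times (minutes),
# checking whether the same runners got faster from Time 1 to Time 2.
from scipy import stats

time1 = [31.5, 29.8, 34.2, 30.1, 33.0, 28.7]  # first race, same runners
time2 = [30.2, 29.1, 32.8, 29.5, 31.9, 28.0]  # second race, same runners

result = stats.ttest_rel(time1, time2)
mean_change = sum(t2 - t1 for t1, t2 in zip(time1, time2)) / len(time1)

print(f"Mean change (Time 2 - Time 1): {mean_change:.2f} minutes")
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3f}")
# A negative mean change with p < .05 would suggest times improved (got faster).
```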
Also known as a Two Samples t-Test, an independent t-Test is used to compare the means of two separate, unrelated groups of people (for example, men and women, or students from two different schools).
Multivariate analyses involve analyzing more than one dependent variable at once. This is in contrast to all of the above Univariate analyses, which only include one dependent variable per analysis.
To conduct multivariate analyses in SPSS, you will often need to use the General Linear Model function.
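For context, a comparable multivariate analysis can also be sketched outside of SPSS in Python with the statsmodels library. This is only a hedged illustration with made-up data and hypothetical variable names (Anxiety_1, Anxiety_2, Group); the General Linear Model function in SPSS remains the route described in this guide.

```python
# Minimal sketch: a MANOVA with two dependent variables and one grouping
# variable, using statsmodels (made-up data, hypothetical variable names).
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

df = pd.DataFrame({
    "Anxiety_1": [7, 6, 5, 5, 4, 3, 6, 7, 5, 4],
    "Anxiety_2": [6, 6, 4, 5, 3, 3, 5, 6, 4, 4],
    "Group":     ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"],
})

# Two dependent variables analyzed at once, predicted by Group
maov = MANOVA.from_formula("Anxiety_1 + Anxiety_2 ~ Group", data=df)
print(maov.mv_test())
```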