Regression analysis is one of the core analysis procedures in SPSS, and standardized residuals are used to assess how well the observed data fit the regression model. By examining residual values, we can identify extreme cases hidden in the data. In fields such as medicine and genetic analysis, researchers often use standardized residuals to identify special cases or anomalies in their results, and then study those cases in depth.
Calculating standardized residuals in SPSS is simple: just use the Linear Regression command. Below, I will demonstrate the specific steps using social research data.
1. Open the analysis file. As shown in the figure, after launching SPSS, open the [File] menu and use the [New], [Open], or [Import Data] command to load the collected raw data file into SPSS.
Figure 1: Open the analysis file
2. Regression - Linear. After the data has been imported, manually check it for missing or omitted values. Once the check is complete, open the [Analyze] menu and choose [Regression] > [Linear].
Figure 2: Regression - Linear
3. Set dependent and independent variables. In the Linear Regression dialog, select the dependent and independent variables from the variable list on the left. Note that only numeric variables can be selected; variables of other types, such as string variables, cannot. I will take 'annual income' as the dependent variable and set 'age', which affects annual income, as the independent variable. If there are multiple independent variables, you can add them to the same independent variable box, or click the [Next] button to enter further variables as a separate block.
Figure 3: Setting dependent and independent variables
4. Statistical settings. After selecting the dependent and independent variables, click the [Statistics] button on the right and check the options you need under "Regression Coefficients" and "Residuals". There are two residual options: Durbin-Watson and Casewise diagnostics. The Durbin-Watson test checks whether the residuals are independent, while casewise diagnostics help us pick out outliers and extreme values. Here, I have chosen casewise diagnostics and set it to flag outliers that lie more than 3 standard deviations out.
Figure 4: Statistical Settings
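To make the Durbin-Watson option less of a black box, here is a minimal Python sketch of the statistic itself. It is only an illustration, not the code SPSS runs. The function name `durbin_watson` and the sample residuals are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Hypothetical residuals, listed in case order (made up for illustration).
example_residuals = [0.8, -0.5, 0.3, -0.9, 0.4, 0.1, -0.2]
dw = durbin_watson(example_residuals)
print(f"Durbin-Watson = {dw:.3f}")
```

The statistic ranges from 0 to 4. Values near 2 suggest that neighbouring residuals are not correlated, values well below 2 point to positive autocorrelation, and values well above 2 point to negative autocorrelation.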
5. Save settings. Click the [Save] button to open the Save dialog, then check "Standardized" under Residuals and "Mean" under Prediction Intervals. The Residuals group also offers "Unstandardized", "Studentized", "Deleted", and "Studentized deleted"; you can try these out on your own. After making your selections, return to the main dialog and click [OK] at the bottom to run the analysis.
Figure 5: Save Settings
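For readers who want to see what the saved "Standardized" column contains, the following Python sketch fits a one-predictor regression and computes both unstandardized and standardized residuals. It assumes that a standardized residual is the raw residual divided by the standard error of the estimate (the square root of the residual mean square). The age and income figures are invented for illustration and are not the data used in this tutorial.

```python
import math


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def standardized_residuals(x, y):
    """Return (unstandardized, standardized) residual lists."""
    intercept, slope = fit_simple_ols(x, y)
    raw = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual degrees of freedom: n minus 2 estimated parameters
    # (intercept and slope).
    sse = sum(e * e for e in raw)
    std_error = math.sqrt(sse / (len(x) - 2))
    return raw, [e / std_error for e in raw]


# Hypothetical data: age in years, annual income in thousands.
age = [25, 30, 35, 40, 45, 50, 55, 60]
income = [30, 36, 41, 47, 52, 90, 61, 66]

raw, std = standardized_residuals(age, income)
for a, e, z in zip(age, raw, std):
    print(f"age={a:>3}  residual={e:>8.2f}  standardized={z:>6.2f}")
```

Dividing by a single standard error puts every residual on the same scale, which is what makes fixed cut-offs such as 2 or 3 meaningful.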
How to view the SPSS standardized residual plot?
Having covered how to produce standardized residuals, I will now explain how to interpret the standardized residual output from SPSS. Together, the charts and statistics show the basic distribution of the data.
1. Normal P-P plot. As shown in the figure, the lower part of the SPSS output viewer contains a Normal P-P plot of the standardized residuals, in which normal cases fall in sequence along the diagonal reference line. Outliers drift away from this line. The data points scattered away from the line are the ones that need our focused attention.
Figure 6: Normal P-P plot
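The idea behind a P-P plot can be sketched in a few lines of Python: each point pairs the observed cumulative proportion of a residual with the cumulative probability a standard normal distribution would assign to it. The plotting position `(rank - 0.5) / n` is an assumption made for this sketch, so SPSS may place the points slightly differently. The function name `pp_points` and the sample values are hypothetical.

```python
from statistics import NormalDist


def pp_points(std_residuals):
    """Return (observed, expected) cumulative probabilities, one pair
    per residual, ordered from the smallest residual to the largest."""
    n = len(std_residuals)
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    points = []
    for rank, z in enumerate(sorted(std_residuals), start=1):
        observed = (rank - 0.5) / n          # assumed plotting position
        expected = standard_normal.cdf(z)    # normal-theory probability
        points.append((observed, expected))
    return points


# Hypothetical standardized residuals.
sample = [-1.2, -0.6, -0.1, 0.2, 0.5, 2.8]
for observed, expected in pp_points(sample):
    gap = expected - observed
    print(f"observed={observed:.3f}  expected={expected:.3f}  gap={gap:+.3f}")
```

Points with a gap close to zero sit on the diagonal, while a large gap corresponds to a point that has drifted away from the line.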
2. Standardized residuals. In the Residuals Statistics table, we can view the standardized residual values. If the absolute value is less than 2, the model's prediction is relatively accurate; if it is greater than 2, there may be suspicious outliers in the file. For example, if the absolute value of the minimum shown in the figure is greater than 2, that value is abnormal, and we need to go back to the data file and correct it.
Figure 7: Standardized Residuals
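The rule of thumb above is easy to apply outside SPSS as well. The following Python sketch lists the cases whose standardized residual exceeds a chosen cut-off in absolute value. The function name `flag_cases` and the sample values are hypothetical.

```python
def flag_cases(std_residuals, threshold=2.0):
    """Return (case_number, residual) pairs whose absolute standardized
    residual exceeds the threshold. Case numbers start at 1."""
    return [
        (case, z)
        for case, z in enumerate(std_residuals, start=1)
        if abs(z) > threshold
    ]


# Hypothetical standardized residuals for seven cases.
sample = [0.3, -2.4, 1.9, 3.2, -0.7, 0.1, -1.1]

print("beyond 2:", flag_cases(sample))
print("beyond 3:", flag_cases(sample, threshold=3.0))
```

A cut-off of 2 casts a wider net for suspicious cases, while a cut-off of 3 corresponds to the stricter setting chosen for the casewise diagnostics earlier.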
3. Histogram. In addition, the histogram shows where the peaks and valleys of the residual distribution lie. Dealing with the extreme values makes the analysis results more stable. The data shown in this graph still broadly follow the expected normal pattern, indicating that only a few extreme data points need attention.
Figure 8: Histogram
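As a rough illustration of what the histogram summarizes, this Python sketch counts standardized residuals in bins of equal width and prints a text version of the chart. The bin width of 1, the function name `bin_counts`, and the sample values are all assumptions made for illustration.

```python
import math
from collections import Counter


def bin_counts(std_residuals, width=1.0):
    """Count residuals per bin, keyed by the left edge of each bin."""
    counts = Counter(math.floor(z / width) * width for z in std_residuals)
    return dict(sorted(counts.items()))


# Hypothetical standardized residuals.
sample = [-1.4, -0.8, -0.4, -0.2, 0.1, 0.2, 0.6, 0.7, 1.3, 2.6]

for left_edge, count in bin_counts(sample).items():
    print(f"{left_edge:>5.1f} | {'#' * count}")
```

A roughly bell-shaped set of bars centred near zero, with only an isolated bar far out in one tail, matches the situation described above, where most cases behave normally and a few extreme points stand apart.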








