The models fit the global temperature spot-on, within measurement accuracy and random noise. You can also remove the ENSO signal etc. (à la Foster & Rahmstorf) and get the same result. All said.
Everything else is vain babble.
I agree the models fit temperature closely enough, given measurement accuracy and noise. Discussing the small differences that do exist is not vain babble; it may lead to learning something about the climate.
The linked article (and two associated images) discusses a new (more powerful) climate model evaluation package to better determine just how accurate each new generation of projections is:
Peter J. Gleckler, Charles Doutriaux, Paul J. Durack, Karl E. Taylor, Yuying Zhang, Dean N. Williams, Erik Mason, and Jérôme Servonnat (May 2016), "A More Powerful Reality Test for Climate Models", EOS.
https://eos.org/project-updates/a-more-powerful-reality-test-for-climate-models
Extract: "Projections of climate change are based on theory, historical data, and results from physically based climate models. Building confidence in climate models and their projections involves quantitative comparisons of simulations with a diverse suite of observations. Climate modelers often consider information from well-established tests and comparisons among existing models to help decide on a new model version among multiple candidates.
Climate model developers and those who use these models benefit from sharing information with each other. Both groups require access to the best available data and rely on open source software tools designed to facilitate the analysis of climate data. Developers benefit the most from comparisons of their models with observations and other models when the results of such analysis can be made quickly available.
Here we introduce a new climate model evaluation package that quantifies differences between observations and simulations contributed to the World Climate Research Programme’s Coupled Model Intercomparison Project (CMIP). This package is designed to make an increasingly diverse suite of summary statistics more accessible to modelers and researchers."
Caption for first image: "(top) Observed and (bottom) simulated seasonal mean (December–January–February) 2-meter surface air temperature data. The observational estimate is taken from surface instrument records, and the model result is an ensemble average of results from more than 20 climate models contributed to the Coupled Model Intercomparison Project. Credit: Peter J. Gleckler/LLNL"
Caption for second image: "Fig. 1. The Coupled Model Intercomparison Project Phase 5 (CMIP5) facilitates the comparison of results from various climate models. Shown here are relative error measures of different developmental tests of the National Oceanic and Atmospheric Administration’s Geophysical Fluid Dynamics Laboratory (GFDL) model. Results are based on the global seasonal cycle climatology (1980–2005) computed from Atmospheric Model Intercomparison Project (AMIP) experiments. Rows and columns represent individual variables and models, respectively. The error measure is a spatial root-mean-square error (RMSE) that treats each variable separately. The color scale portrays this RMSE as a relative error by normalizing the result by the median error of all model results [Gleckler et al., 2008]. For example, a value of 0.20 indicates that a model’s RMSE is 20% larger than the median error for that variable across all simulations on the figure, whereas a value of –0.20 means the error is 20% smaller than the median error. The four triangles in each grid square show the relative error with respect to the four seasons (in clockwise order, with December–January–February (DJF) at the top; MAM = March–April–May, JJA = June–July–August, and SON = September–October–November). The reference data sets are the default satellite and reanalysis data sets identified by Flato et al. [2013]. TOA = top of atmosphere, SW = shortwave, LW = longwave. Credit: Erik Mason/GFDL"
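To make the caption's error measure concrete, here is a minimal Python sketch of the two steps it describes: an area-weighted spatial RMSE for each model, and normalization by the median RMSE across models. The function names, array shapes, and toy numbers are my own illustrative assumptions, not the actual PCMDI metrics package API.

```python
import numpy as np

def spatial_rmse(sim, obs, weights):
    """Area-weighted spatial root-mean-square error between a simulated
    and an observed 2-D field (weights, e.g. grid-cell areas, must have
    the same shape as the fields)."""
    sq_err = (sim - obs) ** 2
    return np.sqrt(np.average(sq_err, weights=weights))

def relative_errors(rmses):
    """Normalize each model's RMSE by the median RMSE across all models
    for that variable. As in the caption: 0.20 means the model's error
    is 20% larger than the median, -0.20 means 20% smaller."""
    rmses = np.asarray(rmses, dtype=float)
    median = np.median(rmses)
    return (rmses - median) / median

# Toy example: made-up RMSEs for one variable across five models.
# The median is 1.1, so the last model's relative error is exactly 0.
rel = relative_errors([1.0, 1.2, 0.9, 1.5, 1.1])
print(np.round(rel, 2))
```

Note that because each variable is normalized separately, the resulting relative errors are dimensionless and comparable across rows of the figure even though the underlying variables have different units.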