We discuss climate models a lot, and from the comments here and in other forums it’s clear that there remains a great deal of confusion about what climate models do and how their results should be interpreted. This post is designed to be a FAQ for climate model questions – a few of which are answered below. If you have comments or other questions, ask them as concisely as possible in the comment section and, if they are of enough interest, we’ll add them to the post so that we can have a resource for future discussions. (We would ask that you please focus on real questions that have real answers and, as always, avoid rhetorical excesses.)

Part II is here.

What is the difference between a physics-based model and a statistical model? Models in statistics, or in many colloquial uses of the term, often imply a simple relationship that is fitted to some observations: a linear regression of temperature against time, or a sinusoidal fit to the seasonal cycle, for instance. More complicated fits are also possible (neural nets for instance). These statistical models are very efficient at encapsulating existing information concisely and, as long as things don’t change much, they can provide reasonable predictions of future behaviour. However, they aren’t much good for predictions if you know the underlying system is changing in ways that might affect how your original variables interact. Physics-based models, on the other hand, try to capture the real physical causes of any relationship, which are hopefully understood at a deeper level. Since those fundamentals are not likely to change in the future, the expectation of a successful prediction is higher. A classic example is Newton’s second law of motion, F=ma, which can be used in multiple contexts to give highly accurate results completely independently of the data Newton himself had on hand. Climate models are fundamentally physics-based, but some of the small-scale physics is only known empirically (for instance, the increase of evaporation as the wind increases). Thus statistical fits to the observed data are included in the climate model formulation, but these are only used for process-level parameterisations, not for trends in time.
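To make the distinction concrete, here is a minimal sketch in Python (with entirely made-up numbers, drawn from no model or observational product): a straight-line fit encapsulates only the behaviour in the period it was fitted to, whereas a physical law like F=ma can be applied in situations far removed from the data used to establish it.

```python
# Illustrative sketch only: a statistical fit vs. a physics-based relation.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 'temperature record': a linear trend plus weather-like noise
years = np.arange(1980, 2010)
temps = 0.02 * (years - 1980) + rng.normal(0, 0.1, years.size)

# Statistical model: a straight-line fit to the observed period
slope, intercept = np.polyfit(years, temps, 1)
print(f"Fitted trend: {slope:.3f} K/yr")   # only meaningful while the system keeps behaving as before

# Physics-based relation: Newton's second law, F = m*a, applies in any context
mass, accel = 10.0, 2.0                    # kg, m/s^2 (arbitrary illustrative values)
print(f"Force = {mass * accel:.1f} N")     # independent of any fitted record
```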

Are climate models just a fit to the trend in the global temperature data? No. Much of the confusion concerning this point stems from a misunderstanding of the point above. Model development does not use the trend data in tuning (see below). Instead, modellers work to improve the climatology of the model (the fit to the average conditions) and its intrinsic variability (such as the frequency and amplitude of tropical variability). The resulting model is pretty much used ‘as is’ in hindcast experiments for the 20th Century.

Why are there ‘wiggles’ in the output? GCMs perform calculations with timesteps of about 20 to 30 minutes so that they can capture the daily cycle and the progression of weather systems. As with weather forecasting models, the weather in a climate model is chaotic. Starting from a very similar (but not identical) state, a different simulation will ensue – with different weather, different storms, different wind patterns – i.e. different wiggles. In control simulations, there are wiggles at almost all timescales – daily, monthly, yearly, decadal and longer – and modellers need to test very carefully how much of any change that follows a change in forcing is really a response to that forcing, and how much is simply due to the internal wiggles.
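The sensitivity to initial conditions can be illustrated with a toy chaotic system. The sketch below uses the classic Lorenz (1963) equations rather than anything from a real GCM, but the point is the same: two runs that start almost (but not exactly) identically soon develop completely different wiggles.

```python
# Toy illustration of chaos (Lorenz 1963 system, not a GCM): the two
# trajectories below differ only by 1e-6 at the start but diverge with time.
import numpy as np

def lorenz_step(state, dt=0.005, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return state + dt * np.array([dx, dy, dz])

a = np.array([1.0, 1.0, 1.0])
b = a + np.array([1e-6, 0.0, 0.0])   # a very similar, but not identical, starting state

for step in range(6001):
    if step % 2000 == 0:
        print(f"step {step:5d}: separation = {np.linalg.norm(a - b):.2e}")
    a, b = lorenz_step(a), lorenz_step(b)
```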

What is robust in a climate projection and how can I tell? Since every wiggle is not necessarily significant, modellers need to assess how robust particular model results are. They do this by seeing whether the same result is seen in other simulations and in other models, whether it makes physical sense, and whether there is some evidence of similar behaviour in the observational or paleo record. If a result is seen in multiple models and multiple simulations, it is likely to be a robust consequence of the underlying assumptions, or in other words, it probably isn’t due to any of the relatively arbitrary choices that mark the differences between different models. If the magnitude of the effect makes theoretical sense independent of these kinds of models, that adds to its credibility, and if the effect matches what is seen in observations, that adds more. Robust results are therefore those that quantitatively match in all three domains (models, theory and observations). Examples are the warming of the planet as a function of increasing greenhouse gases, or the change in water vapour with temperature. All models show basically the same behaviour, in line with basic theory and observations. Examples of non-robust results are the changes in El Niño as a result of climate forcings, or the impact on hurricanes. In both of these cases, models produce very disparate results, the theory is not yet fully developed, and observations are ambiguous.
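A schematic way to see the difference between a robust forced signal and non-robust wiggles is to look at a synthetic ‘ensemble’. In the illustrative sketch below (all numbers invented), every member shares the same imposed long-term warming but has its own internal variability, so the century-scale trend is essentially the same in all members while trends over a single decade are all over the place.

```python
# Schematic sketch with synthetic data (not model output): robust vs. non-robust diagnostics.
import numpy as np

rng = np.random.default_rng(1)
years = np.arange(2000, 2100)
forced = 0.02 * (years - 2000)                                 # hypothetical forced warming, K

# Five 'ensemble members': the same forced signal plus independent internal wiggles
ensemble = [forced + rng.normal(0, 0.15, years.size) for _ in range(5)]

century = [np.polyfit(years, m, 1)[0] for m in ensemble]
decade = [np.polyfit(years[:10], m[:10], 1)[0] for m in ensemble]
print("100-yr trends (K/yr):     ", np.round(century, 3))     # robust: all close to the imposed 0.02
print("first-decade trends (K/yr):", np.round(decade, 3))      # not robust: vary widely, can even be negative
```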

How have models changed over the years? Initially (ca. 1975), GCMs were based purely on atmospheric processes – the winds, radiation, and simplified clouds. By the mid-1980s, there were simple treatments of the upper ocean and sea ice, and cloud parameterisations started to get slightly more sophisticated. In the 1990s, fully coupled ocean-atmosphere models started to become available. This is when the first Coupled Model Intercomparison Project (CMIP) was started. It has subsequently seen two further iterations, the latest (CMIP3) being the database used in support of much of the model work in the IPCC AR4. Over that time, model simulations have become demonstrably more realistic (Reichler and Kim, 2008) as resolution has increased and parameterisations have become more sophisticated. Nowadays, models also include dynamic sea ice, aerosols and atmospheric chemistry modules. Issues like excessive ‘climate drift’ (the tendency for a coupled model to move away from a state resembling the actual climate), which were problematic in the early days, are now much reduced.

What is tuning? We are still a long way from being able to simulate the climate with a true first-principles calculation. While many basic aspects of physics can be included (conservation of mass, energy etc.), many need to be approximated for reasons of efficiency or resolution (e.g. the equations of motion need estimates of sub-gridscale turbulent effects, and radiative transfer codes approximate the line-by-line calculations using band averaging), and still others are only known empirically (the formula for how fast clouds turn to rain, for instance). With these approximations and empirical formulae, there is often a tunable parameter or two that can be varied in order to improve the match to whatever observations exist. Adjusting these values is described as tuning and falls into two categories. First, there is tuning within a single formula so that it best matches the observed values of that specific relationship. This happens most frequently when new parameterisations are being developed. Secondly, there are tuning parameters that control aspects of the emergent system. Gravity wave drag parameters are not well constrained by data, and so are often tuned to improve the climatology of stratospheric zonal winds. The threshold relative humidity for making clouds is often tuned to get the most realistic cloud cover and global albedo. Surprisingly, there are very few of these parameters (maybe a half dozen) that are used in adjusting the models to match the data. It is important to note that these exercises are done with the mean climate (including the seasonal cycle and some internal variability) – and once set, the parameters are kept fixed for any perturbation experiment.
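As a purely illustrative example of the second kind of tuning (the toy ‘model’, the numbers and the albedo relation below are all invented for the purpose), one can scan a plausible range of a single parameter, such as a threshold relative humidity for cloud formation, pick the value that brings an emergent quantity closest to its observed value, and then hold it fixed for any subsequent experiments.

```python
# Toy sketch of tuning a single parameter against an emergent observable.
import numpy as np

OBSERVED_ALBEDO = 0.30                       # roughly Earth's planetary albedo

def toy_albedo(rh_crit):
    """Invented stand-in for a model: a lower threshold means more cloud and a higher albedo."""
    cloud_fraction = np.clip(1.2 - rh_crit, 0.0, 1.0)
    return 0.15 + 0.3 * cloud_fraction

candidates = np.linspace(0.6, 1.0, 401)      # plausible range for the tunable parameter
errors = np.abs([toy_albedo(rh) - OBSERVED_ALBEDO for rh in candidates])
best = candidates[np.argmin(errors)]
print(f"Tuned threshold RH: {best:.3f}, resulting albedo: {toy_albedo(best):.3f}")
# 'best' would then be frozen before running any perturbation experiment.
```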

How are models evaluated? The amount of data that is available for model evaluation is vast, but it falls into a few clear categories. First, there is the climatological average (maybe for each month or season) of key observed fields like temperature, rainfall, winds and clouds. This is the zeroth-order comparison to see whether the model is getting the basics reasonably correct. Next comes the variability in these basic fields – does the model have a realistic North Atlantic Oscillation, or ENSO, or MJO? These are harder to match (and indeed many models do not yet have realistic El Niños). More subtle are comparisons of relationships in the model and in the real world. This is useful for short data records (such as those retrieved by satellite) where there is a lot of weather noise one wouldn’t expect the model to capture. In those cases, looking at the relationship between temperatures and humidity, or cloudiness and aerosols, can give insight into whether the model processes are realistic or not. Then there are the tests of climate changes themselves: how does a model respond to the addition of aerosols in the stratosphere, as was seen in the Mt Pinatubo ‘natural experiment’? How does it respond over the whole of the 20th Century, or at the Maunder Minimum, or the mid-Holocene or the Last Glacial Maximum? In each case, there is usually sufficient data available to evaluate how well the model is doing.
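The zeroth-order comparison can be as simple as computing a bias and a root-mean-square error between the model’s climatological seasonal cycle and the observed one. The sketch below uses synthetic placeholder data rather than real model output or observations, but shows the kind of summary statistics involved.

```python
# Minimal sketch of a climatology comparison using invented placeholder numbers.
import numpy as np

months = np.arange(12)
obs_clim = 15 + 10 * np.sin(2 * np.pi * (months - 3) / 12)    # stand-in observed monthly means, deg C
model_clim = 14 + 11 * np.sin(2 * np.pi * (months - 3) / 12)  # stand-in model monthly means, deg C

bias = np.mean(model_clim - obs_clim)
rmse = np.sqrt(np.mean((model_clim - obs_clim) ** 2))
print(f"Annual-mean bias: {bias:+.2f} C, RMSE of seasonal cycle: {rmse:.2f} C")
```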

Are the models complete? That is, do they contain all the processes we know about? No. While models contain a lot of physics, they don’t contain many small-scale processes that more specialised groups (of atmospheric chemists, or coastal oceanographers for instance) might worry about a lot. Mostly this is a question of scale (model grid boxes are too large for the details to be resolved), but sometimes it’s a matter of being uncertain how to include it (for instance, the impact of ocean eddies on tracers). Additionally, many important bio-physical-chemical cycles (for the carbon fluxes, aerosols, ozone) are only just starting to be incorporated. Ice sheet and vegetation components are very much still under development.

Do models have global warming built in? No. If left to run on their own, the models will oscillate around a long-term mean that is the same regardless of what the initial conditions were. Given different drivers, volcanoes or CO2 say, they will warm or cool as a function of the basic physics of aerosols or the greenhouse effect.

How do I write a paper that proves that models are wrong? Much more simply than you might think since, of course, all models are indeed wrong (though some are useful – George Box). Showing a mismatch between the models and the observational data is made much easier if you recall the signal-to-noise issue we mentioned above. As you go to smaller spatial and shorter temporal scales, the amount of internal variability increases markedly, and so the number of diagnostics that differ from the expected values from the models will increase (in both directions, of course). So pick a variable, restrict your analysis to a small part of the planet, calculate some statistic over a short period of time, and you’re done. If the models match through some fluke, make the region smaller and use a shorter time period, and eventually they won’t. Even if models get much better than they are now, this will always work – call it the RealClimate theory of persistence. Now, appropriate statistics can be used to see whether these mismatches are significant and not just the result of chance or cherry-picking, but a surprising number of papers don’t bother to check such things correctly. Getting people outside the, shall we say, more ‘excitable’ parts of the blogosphere to pay any attention is, unfortunately, a lot harder.
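The signal-to-noise point can be illustrated with a quick Monte Carlo sketch (purely synthetic, independent noise, so only schematic): the spread of an average shrinks as you include more grid boxes and more years, which is exactly why a single location over a single year will almost always ‘disagree’ with a model mean.

```python
# Schematic sketch of how internal-variability noise grows at small scales.
import numpy as np

rng = np.random.default_rng(2)

def spread(n_gridboxes, n_years, n_trials=1000, sigma=1.0):
    """Standard deviation of an average over n_gridboxes and n_years of independent unit noise."""
    samples = rng.normal(0, sigma, (n_trials, n_gridboxes * n_years)).mean(axis=1)
    return samples.std()

print("many boxes, 30 yr:", round(spread(100, 30), 3))
print("few boxes,   5 yr:", round(spread(10, 5), 3))
print("one box,     1 yr:", round(spread(1, 1), 3))   # far larger spread, so apparent mismatches are guaranteed
```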

Can GCMs predict the temperature and precipitation for my home? No. There are often large variations in the temperature and precipitation statistics over short distances because the local climatic characteristics are affected by the local geography. GCMs are designed to describe the most important large-scale features of the climate, such as the energy flow, the circulation, and the temperature in a grid-box volume (through the physical laws of thermodynamics, the dynamics, and the ideal gas law). A typical grid box may have a horizontal area of ~100×100 km², though the size has tended to shrink over the years as computers have become faster. The shape of the landscape (the details of mountains, coastline etc.) used in the models reflects the spatial resolution, hence the model will not have sufficient detail to describe local climate variations associated with local geographical features (e.g. mountains, valleys, lakes, etc.). However, it is possible to use a GCM to derive some information about the local climate through downscaling, since the local climate is affected by both the local geography (a more or less given constant) and the large-scale atmospheric conditions. The results derived through downscaling can then be compared with local climate variables, and can be used for further (and more stringent) assessments of the combined model-downscaling approach. This is, however, still an experimental technique.
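As a very rough illustration of the statistical flavour of downscaling (all data below are synthetic, and the single-predictor linear relation is a deliberate oversimplification), one can fit a relation between a local station record and the overlying grid-box values during an observed period, and then apply that relation to grid-box output from a GCM.

```python
# Toy sketch of empirical-statistical downscaling with invented data.
import numpy as np

rng = np.random.default_rng(3)

gridbox_obs = rng.normal(10, 5, 200)                          # large-scale grid-box temperatures, observed period
station_obs = 0.8 * gridbox_obs - 2 + rng.normal(0, 1, 200)   # local station record, shaped by local geography

slope, intercept = np.polyfit(gridbox_obs, station_obs, 1)    # the downscaling relation

gcm_gridbox = np.array([12.0, 13.5, 15.0])                    # hypothetical GCM grid-box projections
local_estimate = slope * gcm_gridbox + intercept
print("Downscaled local estimates:", np.round(local_estimate, 1))
```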