Do ‘propagation of error’ calculations invalidate climate model projections of global warming?

My thoughts on claims made by Dr. Patrick Frank (SLAC) on the validity of climate model projections of global warming:

 


36 Responses to Do ‘propagation of error’ calculations invalidate climate model projections of global warming?

  1. Pat Frank says:

    This is a response to Dr. Patrick Brown and his critical video. It consists of multiple parts, and is pretty long. So, I’ll post each section separately, so as to make reading and response easier.

    Before proceeding, I’d like to thank Dr. Brown for kindly notifying me of his critique after posting it. His email was very polite and temperate; qualities that were very much appreciated. His video critique is thoughtful, very reasoned, and very clear and calm in presentation. Dr. Brown gave an accurate summary of my method. I also gratefully acknowledge Dr. Brown’s scientific integrity, very apparent in his presentation and especially in his deportment.

    I also acknowledge that, in the first several minutes of his presentation, Dr. Brown correctly described the error propagation method I used.

    I’ll begin by noting that my presentation shows beyond doubt that GCM global air temperature projections are no more than linear extrapolations of greenhouse gas forcing. Linear propagation of error is therefore directly warranted.

    GCMs make large thermal errors. Propagation of these errors through a global air temperature projection will inevitably produce large uncertainty bars.

    Even an uncertainty of ±1 W/m² in tropospheric thermal energy flux will propagate out to an uncertainty of ±4.3 C after 100 years, which is about the same size as the ~4 C mean 2000-2100 anomaly from RCP 8.5, and about 4 times the projection uncertainty admitted by the IPCC.

    Before proceeding to specific points, I’ll mention that in minute 12:35, Dr. Brown observed that the ±17 C uncertainty envelope in RCP 8.5, derived from long wave cloud forcing (LCF) error is, “a completely unphysical range of uncertainty, so it’s totally not plausible that temperature could decrease by 15 degrees as we’re increasing CO₂. And it’s implausible as well that temperature could increase by 17 degrees as we’re increasing CO₂ under the RCP 8.5 scenario. But as I understand it, this is the point Dr. Frank is trying to make.”

    A temperature uncertainty statistic is not a physical temperature. Statistical uncertainties cannot be “unphysical” in the sense Dr. Brown implies. The large uncertainty bars do not indicate possible increases or decreases in air temperature. They indicate a state of knowledge. The uncertainty bars are an ignorance width. I made this very point in my DDP presentation, when the propagated uncertainty envelopes were first introduced.

    It is true that the very large uncertainty bars subsume any possible future air temperature excursion. This condition indicates that no future air temperature can falsify a climate model air temperature projection. No knowledge of future air temperature is contained in, or transmitted by, a climate model temperature expectation value.

    Dr. Brown continued, “So he’s essentially saying that when you properly account for the uncertainty in the climate model projections, the uncertainty becomes so large so quickly that you can’t actually draw any meaning from the projections that the climate models are making.” On this, we are agreed.

    The assessment below of Dr. Brown’s presentation is long. To accommodate readers who do not wish to read through it, here’s a summary. Dr. Brown has:

    • throughout mistaken the time-average statistic of a dynamical response error for a time-invariant error;
    • throughout mistaken theory-bias error for base-state error;
    • repeatedly and wrongly appended a plus/minus to a single-sign offset error, in effect creating a fictitious root-mean-square (rms) error;
    • repeatedly and improperly propagated the fictitious rms error to produce uncertainty envelopes with one fictitious wing;
    • apparently not recognized that only a unique model expectation value qualifies as a prediction in science.

    This list is not exhaustive, but in and of itself it is sufficient to vitiate the analytical merit of Dr. Brown’s analysis in its entirety.

    Now to specifics:

    Dr. Brown’s critique was presented under five headings:
    1. Arbitrary use of 1 year as the compounding time scale.
    2. Use of spatial root-mean-square instead of global mean net error.
    3. Use of error in one component of the energy budget rather than error in net imbalance.
    4. Use of a base state error rather than a response error.
    5. Reality check: Hansen (1988) projection.

    These are taken in turn in subsequent posts. I assume any readers are familiar with the contents of Dr. Brown’s video.

    Minute 15:07, 1. Arbitrary use of 1 year as the compounding time scale.

    From Lauer and Hamilton, page 3831: “A measure of the performance of the CMIP model ensemble in reproducing observed mean cloud properties is obtained by calculating the differences in modeled (x_mod) and observed (x_obs) 20-yr means. These differences are then averaged over all N models in the CMIP3 or CMIP5 ensemble to calculate the multimodel ensemble mean bias delta_mm which is defined at each grid point as delta_mm = (1/N) sum_over_i[(x_mod)_i – x_obs], for i = 1 to N.”
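
    As a rough sketch of the quoted definition, the multimodel ensemble mean bias at each grid point is the average over models of modeled minus observed values. The grid values and the function name multimodel_mean_bias below are made up for illustration; they are not taken from Lauer and Hamilton.

```python
# Hypothetical 20-yr mean LCF values (W/m^2) at three grid points.
x_obs = [26.0, 30.0, 22.0]        # observed 20-yr means (made up)
x_mod = [                          # one row per model (made up)
    [28.0, 27.0, 25.0],
    [24.0, 33.0, 20.0],
    [29.0, 31.0, 24.0],
]


def multimodel_mean_bias(x_mod, x_obs):
    """Return delta_mm per grid point: the mean over models of (x_mod_i - x_obs)."""
    n_models = len(x_mod)
    return [
        sum(model[g] - x_obs[g] for model in x_mod) / n_models
        for g in range(len(x_obs))
    ]


delta_mm = multimodel_mean_bias(x_mod, x_obs)
print(delta_mm)
```

    Note that at each grid point the individual model errors carry both signs, so the ensemble mean bias can be small even when individual models are well off.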

    Page 3831 “The CF [cloud forcing] is defined as the difference between ToA [top of the atmosphere] all-sky and clear-sky outgoing radiation in the solar spectral range (SCF [short-wave cloud forcing]) and in the thermal spectral range (LCF [long-wave cloud forcing]).”

    That is, the ±4 W/m² LCF root-mean-square-error (rmse) is the annual average CMIP5 thermal flux error. The choice of annual error compounding was therefore analytically based, not arbitrary.

    Further, the ±4 W/m² is not a time-invariant error, as Dr. Brown suggested, but rather a time-average error of climate model cloud dynamics. It says that CMIP5 models will average ±4 W/m² error in long-wave cloud forcing each year, every year, while simulating the evolution of the climate.

    Although Dr. Brown did not discuss it, part of my presentation showed that CMIP5 LCF error arises from a theory-bias error common to all tested models. A theory-bias error is an error in the physical theory deployed within the model. Theory-bias errors introduce systematic errors into individual model outputs, and continuing sequential errors into step-wise calculations.

    CMIP5 models introduce an annual average ±4 W/m² LCF error into the thermal flux within the simulated troposphere, continuously and progressively each year and every year in a climate projection.

    Next, Dr. Brown suggested that the annual average could be arbitrarily used for 20 years or for one second. It should now be obvious that he is mistaken. An annual average error can be applied only to a calculation of annual span.

    Dr. Brown’s alternative propagation in 20-year steps used the ±4 W/m² one-year rmse LCF error. A 20-year time step requires a 20-year uncertainty statistic.

    The CMIP5 ±4 W/m² annual average can be scaled back up to a 20-year average LCF rms uncertainty, “±u_20,” calculated as ±u_20 (W/m²) = sqrt[4²×20] W/m² = ±17.9 W/m².

    Using Dr. Brown’s RCP 8.5 scenario as the example, the 2000-2019 change in GHG forcing is 0.89 W/m². The year 2000 base greenhouse gas (GHG) forcing is taken as the sum of the contributions from CO₂+N₂O+CH₄, and is 32.321 W/m², calculated from the equations in G. Myhre, et al., (1998) GRL 25(14), 2715-2718. GHG forcings have recently been updated, but the difference doesn’t impact the force of this demonstration.

    Starting from year 2000, and using the linear model, the uncertainty across a projection consisting of a single 20-year time-step is [(0.42*33.833*±17.9)/32.321] = ±7.9 C, where 33.833 C is the year 2000 net greenhouse temperature.
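
    The arithmetic of the last two steps can be checked with a short sketch. The numbers are the ones quoted above; the function names scaled_rms and step_uncertainty are illustrative only.

```python
import math

ANNUAL_LCF_RMSE = 4.0    # W/m^2, annual average CMIP5 LCF rmse quoted above
F0 = 32.321              # W/m^2, year-2000 GHG forcing quoted above
GREENHOUSE_T = 33.833    # C, year-2000 net greenhouse temperature quoted above
FRACTION = 0.42          # coefficient of the linear model quoted above


def scaled_rms(annual_rms, years):
    """Scale an annual rms uncertainty to a multi-year step: sqrt(u^2 * years)."""
    return math.sqrt(annual_rms ** 2 * years)


def step_uncertainty(flux_uncertainty):
    """Temperature uncertainty (C) for one step of the linear model."""
    return FRACTION * GREENHOUSE_T * flux_uncertainty / F0


u_20 = scaled_rms(ANNUAL_LCF_RMSE, 20)
print(f"u_20 = +/-{u_20:.1f} W/m^2")                        # +/-17.9 W/m^2
print(f"20-year step = +/-{step_uncertainty(u_20):.1f} C")   # +/-7.9 C
```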

    In comparison, at year 2019, i.e., after 20 years, the annual-step RCP 8.5 ±4 W/m² annual average uncertainty compounds to ±7.6 C.

    Likewise, after a series of five 20-year time-steps, the propagated uncertainty at year 2100 is ±17.3 C.

    In comparison, the RCP 8.5 centennial uncertainty obtained propagating the annual ±4 W/m² over 100 yearly time steps from 2000 to 2099 is ±17.1 C.

    So, in both cases, the annually propagated uncertainties are effectively the same values as the propagated 20-year time-steps.

    This comparison shows that, correctly calculated, the final propagated uncertainty is negligibly dependent on time-step size.
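
    A simplified sketch shows why step size drops out. It assumes, purely for illustration, that every step carries the same temperature uncertainty (the year-2000 forcing is held fixed in the denominator), so it is not the full year-by-year calculation behind the figures above.

```python
import math

F0, GREENHOUSE_T, FRACTION = 32.321, 33.833, 0.42   # values quoted above
ANNUAL_RMSE = 4.0                                    # W/m^2, annual LCF rmse


def propagated(years_per_step, n_steps):
    """rss of n_steps equal per-step temperature uncertainties (C)."""
    # Scale the annual rmse to the step length, then convert to temperature.
    flux_u = math.sqrt(ANNUAL_RMSE ** 2 * years_per_step)
    per_step = FRACTION * GREENHOUSE_T * flux_u / F0
    # Equal, independent step uncertainties combine as the root-sum-square.
    return math.sqrt(n_steps * per_step ** 2)


annual = propagated(1, 100)   # one hundred 1-year steps
twenty = propagated(20, 5)    # five 20-year steps
print(f"annual steps:  +/-{annual:.1f} C")
print(f"20-year steps: +/-{twenty:.1f} C")
```

    Under this simplification both routes give about ±17.6 C, in the neighborhood of the ±17.1 C and ±17.3 C quoted above. The quoted figures come from the full calculation, whose year-by-year details are not reproduced here.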

    All of this demonstrates that Dr. Brown’s conclusion at the end of section 1 (minute 16:50), though true, is misguided and irrelevant to the propagated error analysis.

  2. Pat Frank says:

    Minute 17:10, 2. Use of spatial root-mean-square instead of global mean net error.

    In his analysis, Dr. Brown immediately and incorrectly characterized the CMIP5 ±4 W/m² annual average LCF rms error as a “base-state error.”

    However, the LCF rms error was derived from 20 years of simulated climate — the 1986-2005 global climate states. These model years were extracted from historical model runs starting from an 1850 base state.

    The actual “base-state” error would be the difference between the simulated and observed 1850 climate. However, the 1850 climate is nearly unknown. Therefore the true base-state error is unknowable.

    In contrast, the model ±4 W/m² LCF error represents the annual average dynamical misallocation of simulated tropospheric thermal energy flux, during the 20 years of simulation. It is not a base-state error.

    As a relevant aside, looking carefully at the scale-bar to the left of Dr. Brown’s graphic of LCF model error (minute 17:57), the errors vary in general between +10 W/m² and –10 W/m² across the entire globe, with a scatter of deeper excursions.

    With these ±10 W/m² errors in simulated tropospheric thermal flux, we are expected to credit that the models can resolve the effect of an annual GHG forcing perturbation of about 0.035 W/m²; a perturbation about 286 times smaller than the general levels of error in Dr. Brown’s graphic.

    Next, Dr. Brown says that by squaring the LCF error, one makes the error positive. This, he says, doesn’t make sense. However, that representation is incorrect. Squaring the error provides a positive variance. The uncertainty used is the square root of the error variance, which makes it “±,” i.e., plus/minus, not positive. This is not an “absolute value error,” as Dr. Brown represents.

    In minute 18:30, Dr. Brown compares the mean LCF of 28 models with observed cloud LCF, showing that they are similar. By inference, this model mean error is what Dr. Brown means by “net error.”

    However, taking a mean allows positive and negative errors to cancel. Considering only the mean hides the fact that models do in fact make both positive and negative errors in cloud forcing across the globe, as Dr. Brown’s prior graphic showed. These plus/minus errors indicate that the simulated climate state does not correspond to the physically correct climate state.
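
    A toy example makes the cancellation concrete. The six error values are hypothetical and the grid cells are assumed to have equal area; nothing here is taken from the CMIP5 data.

```python
import math

# Hypothetical simulated-minus-observed LCF errors (W/m^2) at six equal-area cells.
errors = [8.0, -7.0, 5.0, -6.0, 3.0, -3.0]

# The mean lets positive and negative errors cancel.
mean_error = sum(errors) / len(errors)

# The root-mean-square keeps the magnitude of the errors regardless of sign.
rms_error = math.sqrt(sum(e ** 2 for e in errors) / len(errors))

print(f"mean error = {mean_error:.2f} W/m^2")     # 0.00
print(f"rms error  = +/-{rms_error:.2f} W/m^2")   # +/-5.66
```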

    In turn, this climate state error puts uncertainty into the simulated air temperature because the climate simulation that produced the temperature is physically incorrect. Therefore, focusing on the mean model LCF hides the physical error in the simulated climate state, and confers a false certainty on the simulated air temperature.

    The point is clearer when considering Dr. Brown’s minute 18:30 graphic. The 28 climate models shown there have differing LCF errors. Their simulated climate states not only do not represent the physically correct climate state, but their simulated states also are all differently incorrect.

    That is, these models not only simulate the climate state incorrectly, but they produce simulation errors that vary across the model set. Nevertheless, the models all adequately reproduce the 1850-to-present global air temperature trend.

    Temperature correspondence among the models means that the same air temperature can be produced by a wide variety of alternative and incorrectly simulated climate states. The question becomes, what certainty can reside in a simulated air temperature that is consistently produced by multiple climate states, all of which are not only physically incorrect, but also incorrect in different ways? Further, when it is known that climate states are simulated incorrectly, what certainty resides in the climate-state evolution in time?

    Taking the mean error hides the plus/minus errors that indicate the simulated climate states are physically incorrect. The approach Dr. Brown prefers confers an invalid certainty on model results.

    In minute 19:37, Dr. Brown then compared the FGOALS and GFDL climate models with widely differing mean LCF offset errors, of -9 W/m² and +0.1 W/m², respectively, and showed they produced hugely different uncertainty envelopes when propagated.

    Propagating these errors is a mistake, however, because they are single-sign single-model mean offsets. They are not the root-mean-square error of each single-model global LCF simulation (see below).

    Neither offset error is a plus/minus value. However, the right side of Dr. Brown’s graphic incorrectly represents them as “±.” Dr. Brown has incorrectly appended a “±” to these single-sign errors. The strictly positive GFDL error can produce only a small positive wing, while the FGOALS calculation is restricted to a large negative wing. That is, Dr. Brown’s double-winged uncertainty envelopes resulted from improperly appending a “±” to mean errors that are strictly positive or negative values.

    Thus, both uncertainty calculations are wrong because a single-model single-sign mean offset was wrongly entered into a propagation scheme requiring a plus/minus rms error.

    The global LCF error for a single model simulation is the rmse calculated from simulated minus observed LCF in the requisite unit-areas across the globe. Taking the root-mean-square of the individual errors produces the global mean single-model plus/minus LCF uncertainty. Propagation of the LCF “±” rmse then produces both positive and negative wings of the uncertainty envelope.

    In minute 20:06, Dr. Brown asked, “Does it make sense that two models that predict similar amounts of warming by 2100 would have uncertainty ranges that differ by orders of magnitude?”

    We’ve seen that Dr. Brown’s error ranges are wrongly calculated and pretty much meaningless.

    Further, the fact that two models deploying the same physics make such different mean LCF errors shows that large parameter disparities are hidden in the models. In order to produce the same air temperature even though the respective mean LCF errors are widely different, the two models must have different suites of offsetting internal errors. That is, Dr. Brown’s objection here actually confirms my analysis. A large uncertainty must attach to a consistent air temperature emergent from disparately incorrect models.

  3. Pat Frank says:

    Minute 20:30, 3. Use of error in one component of the energy budget rather than error in net imbalance.

    Dr. Brown’s argument here does not take cognizance of the difference between the so-called instantaneous response to a forcing change and the equilibrium response. My analysis concerns the instantaneous response to GHG forcing. The equilibrium response includes the oceans, which respond on a much longer time scale. So, inclusion of the ocean heat capacity in Dr. Brown’s argument is a non-sequitur with respect to my error analysis.

    Next, the choice of LCF rms error confines the uncertainty analysis to the tropospheric thermal energy flux, where GHG forcing makes its immediate impact on global air temperature. GHG forcing enters directly into the tropospheric thermal energy flux and becomes part of it. An uncertainty in tropospheric thermal energy flux imposes an uncertainty in the thermal impact of GHG forcing.

    The CMIP5 ±4 W/m² annual average LCF error is about 114 times larger than the annual average ca. 0.035 W/m² forcing increase that CO₂ emissions introduce into the troposphere.

    Dr. Brown proposed that a model with perfect global net energy balance would produce no uncertainty envelope in an error-propagation. However, restricting the question to global net flux in a perfectly balanced model neglects the problem of correctly partitioning the available energy flux among and within the climate sub-states.

    A model with offsetting errors among short-wave cloud forcing, long-wave cloud forcing, albedo, aerosol forcing, etc., etc., can have perfect net energy balance all the while producing physically incorrect simulated climate states, because the available energy flux is misallocated among all the climate sub-states.
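
    As an illustration only, consider a model whose component errors sum to zero. The component values are invented, and using the root-sum-square as a rough measure of the misallocated flux is an assumption made for this sketch.

```python
import math

# Hypothetical flux errors (W/m^2) in individual components of one model.
component_errors = {
    "short-wave cloud forcing": -3.5,
    "long-wave cloud forcing": 4.0,
    "albedo": 1.0,
    "aerosol forcing": -1.5,
}

# Net energy balance error: the offsetting errors cancel exactly.
net_error = sum(component_errors.values())

# Rough magnitude of the misallocation among the components.
misallocation = math.sqrt(sum(e ** 2 for e in component_errors.values()))

print(f"net error     = {net_error:.1f} W/m^2")
print(f"misallocation = {misallocation:.1f} W/m^2")
```

    The net error is zero while the component-wise misallocation is several W/m², which is the situation described above.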

    The necessary consequence is very large uncertainty envelopes associated with the time-wise projection of any simulated observable, no matter that the total energy flux is in balance.

    Cognizance of these uncertainties requires a detailed accounting of the energy flux distribution within the climate. As noted above, the LCF error directly impacts the ability of models to resolve the very small additional forcing associated with GHG emissions.

    This remains true in any model with an overall zero error in net global energy balance, but with significant errors in partitioned energy-flux among climate sub-states. Presently, this caveat to Dr. Brown’s argument includes all climate models.

  4. Pat Frank says:

    Minute 23:50, 4. Use of a base state error rather than a response error.

    Dr. Brown’s opening statement suggests I used a base-state error rather than a response error. This claim was discussed under item 1, where it was noted that the LCF rms error is not a time-invariant error, as Dr. Brown suggested, but a time-average error.

    At the risk of being pedantic, but just to be absolutely clear, a time-invariant error is constant across all time. A time-average error is calculated from individual errors that may, and in this case do, vary across time. The time-average error derived from many models allows one to calculate a time-wise uncertainty that is representative of those models.

    This point was more extensively discussed under item 2 where it was noted that the model LCF error represents the model average of the dynamically misallocated simulated tropospheric thermal energy flux, not a base-state error.

    In pursuing this line, Dr. Brown introduced a simple physical model of the climate, and investigated what would happen with a 5% positive offset (base-state) error in terrestrial emissivity in a temperature projection across 100 years, using that model.

    However, a model positive offset error is not a correct analogy to global LCF rmse error. Correctly analogized to LCF rmse, Dr. Brown’s simple climate model should suffer from an rmse uncertainty of ±5% in terrestrial emissivity. Clearly an rmse uncertainty is not a constant offset error.

    The positive offset error Dr. Brown invoked here represents the same mistaken notion as was noted under item 2, where Dr. Brown incorrectly used a strictly single-sign single-model mean LCF offset error rather than, properly, the single-model global LCF rms error.

    In minute 26:55, Dr. Brown again improperly attached a “±” onto his strictly positive +5% emission offset error. This mistake allowed him to introduce the plus/minus uncertainty envelope said to represent the uncertainty calculated using the linear error model.

    However, the negative wing of Dr. Brown’s uncertainty envelope is entirely fictitious. Likewise, as noted previously, a single-sign offset error cannot be validly propagated.

    Next, when Dr. Brown’s model is correctly analogized, the ±5% emissivity error builds an uncertainty into the structure of the model. The emissivity of the base state has a ±5% uncertainty and so does the emissivity of the succeeding simulated climate states, because the ±5% uncertainty in emissivity is part of the model itself. The model propagates this error into everything it is used to calculate.

    Correctly calculated, the base-state temperature suffers from an uncertainty imposed by the ±5% uncertainty in emissivity. The correct representation of base-state temperature is 288(+3.7/-3.5) K.
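
    The asymmetric bounds can be reproduced with a short sketch. It assumes the simple model behaves like a zero-dimensional energy balance, in which temperature scales as emissivity to the power -1/4; that scaling and the function name temperature_for are assumptions for illustration, not a description of Dr. Brown’s actual code.

```python
T_BASE = 288.0  # base-state temperature quoted above (absolute scale)


def temperature_for(emissivity_factor, t_ref=T_BASE):
    """Temperature when emissivity is scaled by emissivity_factor, assuming T ~ eps**(-1/4)."""
    return t_ref * emissivity_factor ** -0.25


upper = temperature_for(0.95) - T_BASE   # emissivity 5% low gives a warmer state
lower = T_BASE - temperature_for(1.05)   # emissivity 5% high gives a cooler state
print(f"288 (+{upper:.1f}/-{lower:.1f})")   # 288 (+3.7/-3.5)
```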

    The model itself then imposes this uncertainty on the temperature of every subsequent simulated climate state in a step-wise projection.

    The temperature of every simulation step “n-1” used to calculate the temperature of step “n” carries its “n-1” plus/minus temperature uncertainty with it. The temperature of simulated state “n” then suffers its own uncertainty because it was calculated with the model having the structural ±5% uncertainty in emissivity built into it. The total uncertainty of temperature “n” combines with the ±T uncertainty of step “n-1.”

    These successive uncertainties combine as the root-sum-square (rss) in a temperature projection.

    To show the effect of a ±5% uncertainty in emissivity, I duplicated Dr. Brown’s initial 100-year temperature calculation and achieved the same result, 288.04 K → 291.93 K after 100 years. I then calculated the temperature uncertainties resulting from a ±5% uncertainty in the value of the changing emissivity, as it step-wise reduced by 5% across 100 years. The rss error was then calculated for each step.

    The result is that the initial 288(+3.7/-3.5) K became 289.95(+26.6/-25.0) K in the 50th simulation year, and 291.93(+37.6/-35.3) K in the 100th.
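
    The growth of the envelope can be approximated with a minimal root-sum-square sketch. It assumes, as a simplification, that every step carries the same +3.7 upper uncertainty as the base state; the full calculation lets the per-step uncertainty change with the simulated temperature, so the numbers differ slightly.

```python
import math


def rss(uncertainties):
    """Root-sum-square of a sequence of per-step uncertainties."""
    return math.sqrt(sum(u ** 2 for u in uncertainties))


PER_STEP_UPPER = 3.7   # upper base-state uncertainty quoted above

after_50 = rss([PER_STEP_UPPER] * 50)
after_100 = rss([PER_STEP_UPPER] * 100)
print(f"after 50 steps:  +{after_50:.1f}")
print(f"after 100 steps: +{after_100:.1f}")
```

    With a constant per-step value this gives roughly +26 and +37, in the neighborhood of the +26.6 and +37.6 reported above.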

    So, properly analogized and properly assessed, Dr. Brown’s model verifies the method and results of my original climate model error propagation.

    Next, at minute 28:00, Dr. Brown showed that there is no relationship between model base-state error in global average air temperature and model equilibrium climate sensitivity (ECS). However, the Figure 9.42(a) he displayed merely shows the behavior of climate model simulations with respect to themselves. This is a measure of model precision. Figure 9.42(a) does not show the physical accuracy of the models, i.e., how well they represent the physically true climate.

    The fact that Figure 9.42(a) says nothing about physical accuracy, means it also can say nothing about whether any actual systematic physical error leaks from a base-state simulation into projected states. There is no measure of physical error in Figure 9.42(a).

    Figure 9.42(a) has another message, however. It shows that climate models deploying the same physical theory produce highly variable base-state temperatures and highly variable ECS values. This variability in model behavior demonstrates that the models are parameterized differently.

    Climate modelers choose each parameter to be somewhere within its known uncertainty range. The high variability evident in Figure 9.42(a) shows that these uncertainty ranges are very significant. These parameter uncertainties must impose an uncertainty on any calculated air temperature. Indeed, there must be a large uncertainty in the air temperatures displayed in Figure 9.42(a). However, none of the points sport any uncertainty bars. For the same reason of hidden parameter uncertainties, the ECS values must be similarly uncertain, but there are no ECS uncertainty bars, either.

    In standard physical science, parameter uncertainties are propagated through a calculation to indicate the reliability of a result. In consensus climate modeling, this is never done.

    The parameter sets within climate models are typically tuned using known observables, such as the ToA flux, so as to generate parameter values that provide a reasonable base-state climate simulation and to project a reasonable facsimile of known climate observables over a validation time-range. However, tuned models are not known to accurately reproduce the physics of the true climate. Tuning a model parameter set to get a reasonable correspondence merely hides the uncertainty intrinsic to a simulation; an uncertainty that is obviously present when regarding Figure 9.42(a).

    Next, Dr. Brown’s height-weight example is again an incorrect analogy because it is an empirical correlation within a non-causal epidemiological model, whereas a climate model is causal and deploys a physical theory. Dr. Brown’s comparison is categorically invalid.

    A proper comparison would involve using some causal physical model of the human body complete with genetic inputs and resource availability to predict a future height vs. weight curve of a population given certain sets of conditions. Elements of this model would have plus/minus uncertainties associated with them that introduce uncertainties into the output.

    Then, starting from year 2000, the calculation is made to predict the height vs. weight profile through to year 2100. The step-wise calculational uncertainties are propagated forward through the projection. The resulting uncertainty bars condition the prediction, and indicate its reliability.

    The height-weight example marks the third time in his analysis that Dr. Brown misrepresented a constant offset error as a plus/minus uncertainty. He has again incorrectly appended a “±” to a positive-sign offset error. The negative wing of his calculated uncertainty envelope (minute 29:48) is again entirely fictitious.

    This example also again shows that Dr. Brown continued to mistake a theory-bias error, i.e., a plus/minus rmse uncertainty within the structure of a physical theory, for a single-value offset error as might be present in a single calculation. This mistaken notion ramifies through Dr. Brown’s entire analysis.

    Finally, this same mistake does similar violence to Dr. Brown’s step-size example in minute 30:30, where he, once again, mis-analogized theory-error as a base-state error.

    In his example, the correct analogy with rmse LCF error is an rmse plus/minus uncertainty in the size of each step.

    Dr. Brown correctly propagated the 2-feet uncertainty in step-size as the rss, the distance traveled after three steps, with its correct uncertainty of 15±3.46 feet.
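
    The step-size figure is easy to verify. The sketch assumes each step is 5 feet long, which is inferred from the 15-foot total over three steps rather than stated above.

```python
import math

STEP_LENGTH = 5.0        # feet; assumed so that three steps total 15 feet
STEP_UNCERTAINTY = 2.0   # feet, plus/minus uncertainty in each step
N_STEPS = 3

distance = N_STEPS * STEP_LENGTH
# Independent per-step uncertainties combine as the root-sum-square.
uncertainty = math.sqrt(N_STEPS * STEP_UNCERTAINTY ** 2)
print(f"{distance:.0f} +/- {uncertainty:.2f} feet")   # 15 +/- 3.46 feet
```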

    Dr. Brown’s 5-feet offset error only affects the uncertainty in the final distance from an initial reference point. It has nothing to do with an uncertainty in the distance traveled. It is not a correct analogy for the plus/minus LCF error statistic of climate models.

    So, Dr. Brown’s final statement in this section (minute 31:53), that, “[A] bias or error in the base state should not be treated as the same thing as an error in the response (or change),” is correct, but completely irrelevant to propagation of the plus/minus LCF error statistic. The statement only illustrates Dr. Brown’s invariably mistaken notion of the sort of error under examination.

    Again, the CMIP5 ±4 W/m² LCF error is not a constant, single-event base-state error, nor an offset error, nor a time-invariant error. The CMIP5 ±4 W/m² LCF error is a time-average error that arises from, and is representative of, the dynamical errors produced by climate models deploying an incorrect physical theory. It appears in every single step of a climate simulation and propagates forward through a time-wise projection.

  5. Pat Frank says:

    Minute 32:14, 5. Reality check: Hansen (1988) projection.

    Dr. Brown proposed a reality check, which was to plot the observed temperature trend over the Hansen, 1988 Model II scenario projections, shown in minute 34:02.

    Dr. Brown’s mistake here is subtle but critically central. He is treating Hansen scenario B as a unique result; as though there were no other temperature projection possible, under the scenario GHG forcings.

    Before getting to that, however, look carefully at Dr. Brown’s red overlay of observed temperatures. The ascent from scenario C to scenario B is due to the recent El Niño, which is presently in decline. Prior to 2015, before this El Niño, the observed temperature trend matches scenario C quite well, but does not match scenario B.

    According to NASA, air temperatures are now “returning to normal” after the 2016 El Niño. The current air temperature trend shown at Carbon Brief illustrates this decline back to the pre-existing, non-scenario B, state.

    So, it appears that Dr. Brown’s model-observation correspondence claim rests upon a convenient transient.

    Now back to the point concerning the absolutely critical need for unique results in the physical sciences. Unique results from theory are central to empirical test by falsification. Only unique results are testable against experiment or observation. If a physical model has so many internal uncertainties as to produce a wide spray of outputs (expectation values) for the same set of inputs, that model cannot be falsified by any accurate single-valued observation. Such a model does not produce predictions in a scientific sense because even if one of its outputs corresponds to observations, a correspondence between the state of the model and physical reality cannot be inferred.

    The discussion around Figure 9.42(a) above shows that the physics within climate models includes significantly large uncertainties. The models do not, and cannot, produce unique results. Their projections are not predictions, and the internal state of the model does not imply the state of the physically real climate.

    I discussed this point in detail in terms of “perturbed physics” tests of climate model projections, in a post at Anthony Watts’ Watts Up With That (WUWT) blog, here. Interested readers should refer to Figures 1 and 2, and the associated text, in that post.

    The WUWT discussion featured the HADCM3L climate model. When model parameters are varied, the HADCM3L produces a large range of air temperature projections for the identical set of forcings. This result demonstrates the HADCM3L cannot produce a unique solution to the climate energy state. Nor can any other advanced climate model.
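
    The idea of a perturbed-parameter spread can be pictured with a deliberately trivial toy model. Everything in it is hypothetical (the forcing series, the sensitivity range, and the one-line model itself); it is not the HADCM3L and says nothing about the size of the real spread.

```python
# Hypothetical toy model: warming = sensitivity * forcing, where the sensitivity
# parameter is only known to lie somewhere within a range.
forcing = [0.5, 1.0, 1.5, 2.0]               # W/m^2, identical for every run (made up)
sensitivities = [0.4, 0.6, 0.8, 1.0, 1.2]    # C per W/m^2 (made-up range)

# One projection per parameter choice, all driven by the same forcings.
runs = [[s * f for f in forcing] for s in sensitivities]

final_values = [run[-1] for run in runs]
spread = max(final_values) - min(final_values)
print(f"final warming ranges over {spread:.1f} C for identical forcings")
```

    Each run is an equally admissible output of the toy model, so no single run is a unique result for the given forcings.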

    From the post, “No set of model parameters is known to be any more valid than any other set of model parameters. No projection is known to be any more physically correct (or incorrect) than any other projection.

    “This means, for any given projection, the internal state of the model is not known to reveal anything about the underlying physical state of the true terrestrial climate.”

    The same is true of Dr. Hansen’s 1988 projection. Variation of its parameters within their known range of uncertainties would have produced a large number of alternative air temperature trends. The displayed scenario B is just one of them, and is not unique to its set of forcings. Scenario B is not a prediction, and it is not validated as physically correct, merely because it happens to approximate the observed air temperature trend.

    In his 2005 essay, “Michael Crichton’s ‘Scientific Method,’” Dr. Hansen himself wrote that the agreement between his scenario B and observed air temperature is fortuitous, in part because the Model II ECS was too large and also because of “other uncertain factors.” Dr. Hansen’s modestly described, “other uncertain factors,” are likely to be the large parameter uncertainties and the errors in the physical theory, as discussed above. Dr. Hansen’s 2005 article is available here: http://www.columbia.edu/~jeh1/2005/Crichton_20050927.pdf (106 kB).

    Fortuitousness of agreement does not lend itself to Dr. Brown’s claim of predictive validity.

    Dr. Hansen went on to say about his 1988 scenario B that, “it is becoming clear that our prediction was in the right ballpark”, showing that he, too, apparently does not understand the critical requirement, indeed the sine qua non, of a unique result to qualify a calculation from theory as a scientific prediction.

    Similar criticism applies to Dr. Brown’s Figure at minute 34:52, “Modeled and Observed Global Mean Surface Temperature.” The air temperature uncertainty envelope is merely the standard deviation of the CMIP5 model projections around the ensemble model mean. This is a measure of model precision, and indicates nothing about the physical accuracy of the mean projection.

    The models have all been tuned to produce alternative suites of parameters that permit a reasonable-seeming projection. The HADCM3L example illustrates that under conditions of perturbed physics, each of those models would produce a range of projections with a spread much larger than Dr. Brown’s Figure admits, all with the identical set of forcings.

    Neither the mean projection, nor any of the individual model projections represent a unique result. Tuning the parameter sets and reporting just the one projection has merely hidden the large uncertainty inherent in each projection.

    The correct plus/minus uncertainty in the mean projection is the [rms/(n-1)] uncertainty calculated from the uncertainties in the individual projections, meaning that the occult uncertainty in the ensemble mean is larger than the occult uncertainty in each individual projection.

    Dr. Brown’s question at the end, “How long would observed temperature need to stay close to the climate model projections before we can say that climate models are giving us useful information about how temperature responds to greenhouse gas forcing?” is unfortunate.

    Models have been constructed to require the addition of greenhouse gas forcing in order to reproduce global air temperature. Then turning around and saying that models with greenhouse gas forcings produce temperature projections close to observed air temperatures, is to invoke a circular argument.

    Given the IPCC forcings, the linear model of my analysis reproduces the recent air temperature trend just as well as do the CMIP5 climate models. In the spirit of Dr. Brown’s question, we can just as legitimately ask, ‘How long would observed temperature need to stay close to the linear model projections before we can say that the linear model gives us useful information about how temperature responds to greenhouse gas forcing?’ The obvious answer is ‘forever,’ because the linear model will never ever give us such useful information.

    And now that we know about the uncertainties hidden within the CMIP5, and prior, climate models, we also know the same, ‘forever, never, ever,’ answer applies to them as well.

    We know the terrestrial climate has emerged from the Little Ice Age, and has been warming steadily since about 1850. Following Dr. Brown’s final question, even if the warming continues into the 21st century, and the projections of tuned, adjusted and tendentious (constructed to need the forcing from GHG emissions) climate models stay near that warming air temperature trend, the model projection uncertainties are so large and the expectation values are so non-unique, that any future correspondence cannot escape Dr. Hansen’s diagnosis of “fortuitous.”

    Summary conclusion: Not one of Dr. Brown’s objections survives critical examination.

  6. ptbrown31 says:

    Thanks for these replies! I will think about them and post my responses here within a few days.

  7. Maybe I can just ask Frank a simple question. If we were to take a climate model and rerun it with the solar forcing 4W/m^2 different from what it is observed to be, does he think we should propagate this “error” in the same way as he’s suggesting that we propagate the cloud forcing error?

    • Pat Frank says:

      Answer: no.

      • Pat,
        As I understand it, that is essentially equivalent to the long wavelength cloud forcing being in error by 4 W/m^2. If you would propagate one, why not the other?

      • Pat Frank says:

        The offset error you propose is not a rms uncertainty obtained from a calibration experiment. The LCF error is ±4Wm⁻², not a positive offset.

      • Pat,
        I think you’re simply wrong. The 4W/m^2 error in cloud forcing is equivalent to an offset. It doesn’t represent the range in which it can vary from step to step, which is what you’re suggesting. As others have pointed out, if it did, the output from climate models would vary far more than it currently does. Therefore, your suggestion that there is a 4W/m^2 cloud forcing uncertainty at each step (or each year) is clearly wrong.

  8. I would like to thank Dr. Frank for engaging with my criticism of his method. His response to my video has caused me to think deeper about several issues being discussed. After considering Dr. Frank’s responses, however, I am still very much unconvinced that Dr. Frank’s method makes sense.

    I think this conversation has been successful in that both parties have been respectful and have not resorted to ad hominem attacks. Unfortunately, it looks like both parties have been unpersuaded to move very much from their initial positions. My hope is that this conversation has at least helped illuminate where precisely the disagreements lie.

    Below, I address the portions of Dr. Frank’s response that I feel are most relevant to the question at hand: whether or not his method of propagating errors makes sense. I try to avoid nitpicking or opening up tangential climate science debates (which I would be happy to have elsewhere).

    ———————-
    Section 1. Arbitrary use of 1 year as the compounding time scale.

    “From Lauer and Hamilton, page 3831: ‘A measure of the performance of the CMIP model ensemble in reproducing observed mean cloud properties is obtained by calculating the differences in modeled (x_mod) and observed (x_obs) 20-yr means’… That is, the ±4 W/m² LCF root-mean-square-error (rmse) is the annual average CMIP5 thermal flux error. The choice of annual error compounding was therefore analytically based, not arbitrary.…Next, Dr. Brown suggested that the annual average could be arbitrarily used for 20 years or for one second. It should now be obvious that he is mistaken. An annual average error can be applied only to a calculation of annual span.”

    Differencing two 20-year means does not produce an “annual error”. Even if the underlying temporal resolution of the data was annual it still wouldn’t be an annual error because we can arbitrarily scale up or scale down the temporal resolution. Consider the following example: Let’s say that over 10 years, we observe the following model vs. observation annual average differences (errors):

    Year1 = -1, Year2 = 2, Year3 = -3, Year4 = 2, Year5 = 4, Year6 = 2, Year7 = -6, Year8 = 5, Year9 = 3, Year10 = 2.

    If we average these numbers together we get 1. So we could say that the “annual average error” is 1. But the original data could have been presented as biennial average errors rather than annual average errors:

    Years1-2 = 0.5, Years3-4 = -0.5, Years5-6 = 3, Years7-8 = -0.5, Years9-10 = 2.5.

    Or the data could have been presented as quinquennial average errors:

    Years1-5 = 0.8, Years6-10 = 1.2

    If we average the series of biennial or quinquennial average errors together we will still get 1 in both cases. But in these cases, the apparent underlying temporal resolution is now 2 years or 5 years so we might be tempted to call the error the “biennial average error” or the “quinquennial average error” instead of the “annual average error”.
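    A minimal sketch of the arithmetic above, in plain Python. The ten annual values are the ones listed in the example; the helper name `block_means` is introduced here for illustration only.

```python
from statistics import fmean

# Annual model-minus-observation differences from the example above.
annual = [-1, 2, -3, 2, 4, 2, -6, 5, 3, 2]


def block_means(values, width):
    """Average consecutive, non-overlapping blocks of `width` values."""
    return [fmean(values[i:i + width]) for i in range(0, len(values), width)]


biennial = block_means(annual, 2)      # [0.5, -0.5, 3.0, -0.5, 2.5]
quinquennial = block_means(annual, 5)  # [0.8, 1.2]

# The overall mean is the same at every temporal resolution.
mean_annual = fmean(annual)
mean_biennial = fmean(biennial)
mean_quinquennial = fmean(quinquennial)
```

    The three means agree because every block has the same length; with unequal blocks the block means would have to be weighted by block length.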

    I have contacted Axel Lauer of the cited paper (Lauer and Hamilton, 2013) to make sure I am correct on this point and he told me via email that “The RMSE we calculated for the multi-model mean longwave cloud forcing in our 2013 paper is the RMSE of the average *geographical* pattern. This has nothing to do with an error estimate for the global mean value on a particular time scale.”

    The point is that the ±4 W/m² root-mean-square error does not have an intrinsic annual timescale attached to it. Its units are W/m² not W/m²/year. Thus, the choice to compound annually is arbitrary.

    ———————-
    Section 2. Use of spatial root-mean-square instead of global mean net error.

    “In his analysis, Dr. Brown immediately and incorrectly characterized the CMIP5 ±4 W/m² annual average LCF rms error as a “base-state error.” However, the LCF rms error was derived from 20 years of simulated climate — the 1986-2005 global climate states. These model years were extracted from historical model runs starting from an 1850 base state…. In contrast, the model ±4 W/m² LCF error represents the annual average dynamical misallocation of simulated tropospheric thermal energy flux, during the 20 years of simulation. It is not a base-state error.”

    I used the term “base-state error” in my presentation to distinguish a climatological (time mean) error from a response error. I will acknowledge that this wording was unclear. However, if you replace the phrase “base-state error” in my presentation with “climatological mean error” or “annual average dynamical misallocation of simulated tropospheric thermal energy flux, during the 20 years of simulation” it wouldn’t change the point I was making.

    “With these ±10 W/m² errors in simulated tropospheric thermal flux, we are expected to credit that the models can resolve the effect of an annual GHG forcing perturbation of about 0.035 W/m²; a perturbation ±286 times smaller than the general levels of error in Dr. Brown’s graphic.”

    This comment gets at the heart of my disagreement with Dr. Frank. The units for the errors are indeed W/m² whereas the units for the annual GHG forcing are W/m²/year. One is a climatological mean flux error and one is a change in flux per unit time. Since they have different units, it is not appropriate to take their ratio and speak of one being X times smaller than the other.

    “Next, Dr. Brown says that by squaring the LCF error, one makes the error positive. This, he says, doesn’t make sense. However, that representation is incorrect. Squaring the error provides a positive variance. The uncertainty used is the square root of the error variance, which makes it “±,” i.e., plus/minus, not positive. This is not an “absolute value error,” as Dr. Brown represents.”

    The point I was trying to make was that the root-mean-square error does not allow for spatial cancellation. Dr. Frank’s method purports to calculate the uncertainty in global mean surface air temperature projections. Global mean surface air temperature error is related to global mean net flux error (which allows for cancellation) not to spatially calculated root-mean-square error.

    “…These plus/minus errors indicate that the simulated climate state does not correspond to the physically correct climate state. In turn, this climate state error puts uncertainty into the simulated air temperature because the climate simulation that produced the temperature is physically incorrect.”

    I am reminded of the phrase credited to George Box that “All models are wrong but some are useful”. Of course no model has a perfect correspondence to the true climate state. The question at hand is whether or not this lack of perfection allows us to disregard all global temperature projections as completely useless (Dr. Frank’s claim).

    “Neither offset error is a plus/minus value. However, the right side of Dr. Brown’s graphic incorrectly represents them as “±.” Dr. Brown has incorrectly appended a “±” to these single-sign errors. The strictly positive GFDL error can produce only a small positive wing, while the FGOALS calculation is restricted to a large negative wing.”

    While it is true that a root-mean-square error does not necessarily imply a climatological mean error, a climatological mean error does imply (and can be represented by) a root-mean-square error. All flux errors originally have only a single sign (either + or -). The “±” comes from taking the square root of the squared error (in the same way the square root of 25 is not 5, but ± 5). The root-mean-square of an error of -9 W/m² is:

    SQRT((-9 W/m²)²/1) = ±9 W/m².

    (Since this is a global mean net longwave cloud forcing error, n in the root-mean-square error formula is just 1.)

    So the “±” comes about from adhering to Dr. Frank’s method of using root-mean-square error and thus it has not been incorrectly appended.

    ———————-
    Section 3. Use of error in one component of the energy budget rather than error in net imbalance.

    “Dr. Brown proposed that a model with perfect global net energy balance would produce no uncertainty envelope in an error-propagation. However, restricting the question to global net flux in a perfectly balanced model neglects the problem of correctly partitioning the available energy flux among and within the climate sub-states.”

    I agree that we desire models that correctly partition the energy flux into its various components. However, when the topic of interest is global mean surface air temperature, it is the global mean net flux that is most relevant, not how that flux is distributed among its components.

    ———————-
    Section 4. Use of a base state error rather than a response error.

    “However, a model positive offset error is not a correct analogy to global LCF rmse error. Correctly analogized to LCF rmse, Dr. Brown’s simple climate model should suffer from a rmse uncertainty of ±5% in terrestrial emissivity. Clearly a rmse uncertainty is not a constant offset error. The positive offset error Dr. Brown invoked here represents the same mistaken notion as was noted under item 2, where Dr. Brown incorrectly used a strictly single-sign single-model mean LCF offset error rather than, properly, the single-model global LCF rms error. Dr. Brown again improperly attached a “±” onto his strictly positive +5% emission offset error. This mistake allowed him to introduce the plus/minus uncertainty envelope said to represent the uncertainty calculated using the linear error model.”

    As I say above, the “±” part of the ±5% emissivity error came from taking the square root of the squared error. Thus, the root-mean-square of a +5% error is:

    SQRT((+5%)²/1) = ±5%.

    Again, the “±” comes about from adhering to Dr. Frank’s method of using root-mean-square error. Thus, the “±” is not improperly attached to the 5% emissivity offset error.

    “a single-sign offset error cannot be validly propagated.”

    This is an interesting statement. Imagine there are two models, Model A and Model B.

    Model A has a climatological longwave cloud forcing error of +10 W/m² at every location on earth.

    Model B has a climatological longwave cloud forcing error of +10 W/m² over half of the earth and -10 W/m² over the other half.

    In both models, the root-mean-squared error is ± 10 W/m². Dr. Frank seems to be saying that he would not even be able to apply his method to Model A. Is it a requirement that models have offsetting (canceling) spatial errors in order for the error to be propagated? This would seem like an odd requirement.
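    The Model A / Model B comparison can be sketched numerically. This is a toy illustration in plain Python, assuming a hypothetical globe of 1000 equal-area cells (the cell count and the helper names `rmse` and `mean_error` are not from the discussion above).

```python
import math


def rmse(errors):
    """Root-mean-square of a list of signed errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mean_error(errors):
    """Signed mean, which lets positive and negative errors cancel."""
    return sum(errors) / len(errors)


n_cells = 1000  # hypothetical number of equal-area grid cells

# Model A: +10 W/m² everywhere.
model_a = [10.0] * n_cells
# Model B: +10 W/m² over half of the cells, -10 W/m² over the other half.
model_b = [10.0] * (n_cells // 2) + [-10.0] * (n_cells // 2)

rmse_a, rmse_b = rmse(model_a), rmse(model_b)              # both 10.0
mean_a, mean_b = mean_error(model_a), mean_error(model_b)  # 10.0 and 0.0
```

    The two models are indistinguishable by spatial RMSE but differ completely in global mean net error, which is the distinction at issue here.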

    “…So, properly analogized and properly assessed, Dr. Brown’s model verifies the method and results of my original climate model error propagation.”

    I disagree. The whole point of this demonstration was to show that, despite what Dr. Frank’s method suggests, uncertainty in the climatological mean emissivity did not actually propagate into uncertainty in the change in temperature over time.

    “Figure 9.42(a) has another message, however. It shows that climate models deploying the same physical theory produce highly variable base-state temperatures and highly variable ECS values. This variability in model behavior demonstrates that the models are parameterized differently. Climate modelers choose each parameter to be somewhere within its known uncertainty range. The high variability evident in Figure 9.42(a) shows that these uncertainty ranges are very significant. These parameter uncertainties must impose an uncertainty on any calculated air temperature.”

    I’ll just point out that I agree with all of this.

    “Next, Dr. Brown’s height-weight example is again an incorrect analogy because it is an empirical correlation within a non-causal epidemiological model, whereas a climate model is causal and deploys a physical theory. Dr. Brown’s comparison is categorically invalid. A proper comparison would involve using some causal physical model of the human body complete with genetic inputs and resource availability to predict a future height vs. weight curve of a population given certain sets of conditions. Elements of this model would have plus/minus uncertainties associated with them that introduce uncertainties into the output.”

    It is reasonable to point out the difference between a physical and statistical model. But for the sake of argument, let’s imagine that the model I was describing was indeed a causal physical model of the human body. Would it make sense that an initial error (or uncertainty) in height (which could be characterized as a ± root-mean-square error) would indicate that the model has absolutely nothing to say about the relationship between age and height?

    “This example also again shows that Dr. Brown continued to mistake a theory-bias error, i.e., a plus/minus rmse uncertainty within the structure of a physical theory, for a single-value offset error as might be present in a single calculation. This mistaken notion ramifies through Dr. Brown’s entire analysis.”

    I will say again that an offset error can be represented as a ± root-mean-square error. It seems that Dr. Frank is suggesting that there would be no problem with a model that has a uniform sign offset error (like my hypothetical Model A described above) because a uniform sign error would somehow not count as a “theory-bias error”. This makes no sense to me.

    “Dr. Brown’s 5-feet offset error only affects the uncertainty in the final distance from an initial reference point. It has nothing to do with an uncertainty in the distance traveled…”

    I agree!

    “…It is not a correct analogy for the plus/minus LCF error statistic of climate models.”

    I disagree.

    “The CMIP5 ±4 W/m² LCF error is a time-average error that arises from, and is representative of, the dynamical errors produced by climate models deploying an incorrect physical theory. It appears in every single step of a climate simulation and propagates forward through a time-wise projection.”

    I agree that the ±4 W/m² LCF error “appears in every single step” but that does not imply that it propagates and compounds in every single step. This is a non-sequitur that amounts to changing the error’s units from ±4 W/m² to ±4 W/m²/year.

    ———————-
    Section 5. Reality check: Hansen (1988) projection.

    “Similar criticism applies to Dr. Brown’s Figure at minute 34:52, “Modeled and Observed Global Mean Surface Temperature.” The air temperature uncertainty envelope is merely the standard deviation of the CMIP5 model projections around the ensemble model mean. This is a measure of model precision, and indicates nothing about the physical accuracy of the mean projection.”

    It is not the uncertainty envelope but the relative agreement with observations that tells us something about the physical accuracy of the mean projection. If climate models gave us zero information on the temperature response to CO2 then we would expect larger disagreements between models and observations.

    “Models have been constructed to require the addition of greenhouse gas forcing in order to reproduce global air temperature. Then turning around and saying that models with greenhouse gas forcings produce temperature projections close to observed air temperatures, is to invoke a circular argument.”

    This is a strange statement. Is it Dr. Frank’s contention that the radiative forcing from greenhouse gasses is something that could have legitimately been left out of a quantitative simulation of climate? The global temperature response to greenhouse gasses is an emergent property of these physical models. It is not some simple arbitrary relationship programmed into the models.

    I’ll note that Dr. Frank brings up plenty of other interesting philosophical points in his section 5 response but discussing them here would distract from the main topic of whether or not his method’s uncertainty quantification makes sense.

    ———————-
    Summary

    I am not persuaded that any of my primary objections to Dr. Frank’s method are misguided and I would still argue strongly that:

    1) The ±4 W/m² climatological root-mean-square longwave cloud forcing error is not intrinsically tied to the annual time scale.

    2) If the goal is to assess uncertainty in global mean surface air temperature, global mean net longwave cloud forcing error is more relevant than the spatial root-mean-square longwave cloud forcing error.

    3) If the goal is to assess uncertainty in global mean surface air temperature change, the net energy budget error is much more relevant than the climatological mean longwave cloud forcing error.

    4) Most fundamentally: A climatological (time-mean) longwave cloud forcing error of ±4 W/m² is not the same thing as a ±4 W/m²/year compounding error in the response of the climate system to forcing.

    5) If climate models had zero skill in projecting global mean surface air temperature then we would expect a larger divergence between model projections and observations than we see in reality.

    Unfortunately, it looks as though Dr. Frank and I are at somewhat of an impasse and thus I think it will have to be left up to the interested reader (if there are any) to decide who is correct.

    • Pat Frank says:

      Dr. Brown, I just found a few minutes ago that you had replied, after checking at “…and Then There’s Physics.”

      I’ll be going through your assessment, but did notice immediately that your first example concerning annual uncertainty is incorrect.

      The Lauer and Hamilton annual LCF uncertainty is the root-mean-square of the differences, not the linear average that you have. The correct analogy is the square root of the mean square of your differences, i.e., sqrt[(sum over (delta_i)^2)/N], which in your case is ±3.35.

      Note that the plus/minus of the Lauer and Hamilton LCF error emerges directly using the correct analogy. The fact that the plus/minus is missing from your result should have indicated immediately that you’d used the wrong error model.

      Your biennial average is also calculated incorrectly. The mean uncertainty from each set of two years should be the root mean square, not the linear average. For example, your years 1-2 mean error should be sqrt[(1^2+2^2)/2] = ±1.6, not 0.5, and the final uncertainty over the ten years taken in five biennial steps is again ±3.35.
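      A minimal sketch of this root-mean-square calculation, in plain Python, using the ten annual differences from Dr. Brown's example. The helper name `rms` is introduced here for illustration.

```python
import math


def rms(values):
    """Root-mean-square: sqrt of the mean of the squared values."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Annual differences from Dr. Brown's ten-year example.
annual = [-1, 2, -3, 2, 4, 2, -6, 5, 3, 2]

# Direct rms over all ten years: sqrt(112 / 10), about 3.35.
rms_annual = rms(annual)

# rms within each two-year block, then rms of the five block values.
pairs = [annual[i:i + 2] for i in range(0, len(annual), 2)]
biennial_rms = [rms(p) for p in pairs]   # first entry is sqrt(2.5), about 1.58
rms_from_biennial = rms(biennial_rms)    # again about 3.35
```

      The two routes agree only because every block contains the same number of years; unequal blocks would need weighting by block length.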

      This is all standard error analysis.

      I’ll point out again that error cancellation by combining positive and negative errors does not remove uncertainty, because the underlying physics is not improved. Cancelling errors only hides the uncertainty in the result.

      • It is not incorrect because I was not claiming that example was a root-mean-square error. I was simply illustrating that the mean error does not have an intrinsic timescale. If you would like this example to more closely resemble the Lauer and Hamilton calculation consider this:

        [Figure: Ann vs Bienn RMSE]

        This calculation shows that you get the same spatial RMSE whether or not the underlying temporal resolution of the data is annual or biennial. Therefore you would be just as justified calling the calculated RMSE the “biennial average error” as you would be calling it the “annual average error”. In reality, there is no intrinsic timescale attached to this RMSE.
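        The kind of comparison described here can be sketched as follows, in plain Python. The numbers are hypothetical stand-ins (three locations, six years) and are not the values in the figure; the helper names are also introduced only for illustration.

```python
import math

# Hypothetical model-minus-observation errors: 3 locations x 6 years.
errors = {
    "location_1": [1.0, 3.0, -7.0, 7.0, 5.0, 9.0],
    "location_2": [-5.0, 6.0, 1.0, -1.0, 4.0, -8.0],
    "location_3": [3.0, -3.0, 1.0, 1.0, -2.0, 5.0],
}


def mean(values):
    return sum(values) / len(values)


def block_means(values, width):
    """Average consecutive, non-overlapping blocks of `width` values."""
    return [mean(values[i:i + width]) for i in range(0, len(values), width)]


def spatial_rmse(time_means):
    """RMSE across locations of the time-mean error at each location."""
    return math.sqrt(sum(m * m for m in time_means) / len(time_means))


# Time-mean error per location, starting from annual data...
time_means_annual = [mean(series) for series in errors.values()]
# ...and starting from the same data pre-averaged into biennial values.
time_means_biennial = [mean(block_means(series, 2)) for series in errors.values()]

rmse_annual = spatial_rmse(time_means_annual)
rmse_biennial = spatial_rmse(time_means_biennial)
```

        Under these assumptions the time-mean error at each location is unchanged by the pre-averaging, so the spatial RMSE computed from it is the same in both cases.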

      • Pat Frank says:

        Your presentation here confuses error and uncertainty.

        In the sense of the LCF rms error, the numbers you’re supplying should be seen as model calibration experiments, relative to known observations.

        The methodological or model errors revealed in calibration experiments are never, ever, linearly combined nor ever linearly averaged.

        As with Lauer and Hamilton, the model calibration errors would be combined as their rms. That puts them in a form useful to judge the reliability of, or uncertainty associated with, predictions of future states.

        E.g., for Location 1, rms error = sqrt{[(1/year)² + (3/year)² + … +(9/year)²]/6} = ±e/year

        rmse error Location 1: ±6.12/year
        rmse error Location 2: ±4.97/year
        rmse error Location 3: ±2.92/year

        Overall mean rmse: ±4.85/year

        In your biennial example, each set of two years must be combined as their rms. The rms error again carries the dimension, per year.

        The three sets of two year rms errors are again combined as their rms. The 6-year rms error carries dimension per year.

        Combined into a total rms error, the 18 years combined error is of dimension per year.

        Location      1             2             3
        Year 1-2      ±2.23/year    ±5.52/year    ±3.0/year
        Year 3-4      ±7.0/year     ±1.0/year     ±1.0/year
        Year 5-6      ±7.65/year    ±6.52/year    ±3.81/year

        Mean rmse     ±6.12/year    ±4.96/year    ±2.86/year

        Overall rmse: ±4.84/year

        The rms error is the same in each case when the uncertainties are computed correctly. In each case, the rmse has dimension per year.

  9. JCH says:

    Very good read.

    If climate models gave us zero information on the temperature response to CO2 then we would expect larger disagreements between models and observations.

    Boy howdy.

    The Ozone Man basically presented James E. Hansen. He was laughed at… made fun of by the President of the United States. At the time everybody had a good time making fun of the Ozone Man. Since then sea level has shot up; vast amounts of ice have melted; the GMST is now close to Hansen’s projection; onsets of seasons are moving; you do not have to bury your water pipes as deep as you once did; etc. Nobody, especially a clown like the Ozone Man, gets that freakin’ lucky.

  10. GISS Model II, used by Hansen in the 1980s, lives on as EdGCM. Written in the 1980s, it’s difficult to understand how the code writers could know the evolution of global surface temperatures over the ensuing years. I.e., in 1980 it would have been very difficult to write a program that mimicked global SSTs through 2015 since global SSTs through 2015 were then an unknown.

    Yet if one looks at the output of EdGCM with and without an increase in CO2 forcing the differences are both stark and informative.

  11. The proof is in the pudding. If climate models can make accurate predictions, then they are not useless. Here is a list of 17 accurate predictions made by climate models:

    http://bartonlevenson.com/ModelsReliable.html

    The thesis that climate models are useless is thus easily falsified. Q.E.D.

  12. Pat Frank says:

    I thank Dr. Brown for his outstandingly thoughtful, civil, and temperate response to my 29 January assessment of his video and throughout this debate. This essay responds to Dr. Brown’s 1 February commentary.

    Summary conclusion: Dr. Brown’s assessment does not withstand critical scrutiny.
    • The LCF error is dimensionally derived and is Wm⁻² year⁻¹ (grid-point)⁻¹;
    • Throughout, physical error is mistaken for uncertainty;
    • The “base-state error” analogy remains incorrect;
    • Accidental agreement does not indicate accuracy.

    Dr. Brown’s most recent points are considered in turn below.

    Point 1: Dr. Brown’s claim that the choice of 1-year propagation steps is arbitrary.

    Let’s work through the method of Lauer and Hamilton. [1]

    They interpolated cloud properties onto a 1° x 1° grid to allow uniform cross-comparisons. Observed properties were averaged across 20 years. Likewise, for each model, simulated cloud properties were averaged across 20 years at each grid-point. They calculated the 20-year model grid-point means, subtracted each model mean from the observational mean to produce the global grid-point error as the difference between model means and observational means. The root-mean-square of the differences produced ±4 W/m² LCF error. As I show below, this error is dimensioned year⁻¹.

    Let “O.n.n” = Observed.year_number.grid_point_register. Then the observed 20-year set of yearly cloud properties at each 1° × 1° grid-point (x) are averaged to give a 20-year mean of cloud properties at each grid-point as follows:

    O.1.1, O.1.2 … O.1.x … O.1.n
    +

    +
    O.20.1, O.20.2 …O.20.x … O.20.n

    The 20 annual series are summed and divided by 20 years to give:

    O.m.1, O.m.2 … O.m.x … O.m.n,

    where “m” is “mean”, representing the 20-year mean observed cloud properties at each of “n” 1° × 1° grid-points. With “20 years” in the divisor, the dimension attached to the mean observed is, ‘average-observed year⁻¹ (grid-point)⁻¹.’

    The same is done for each of 27 CMIP5 models, yielding for each model, Mi, (i = 1-27):

    For the 20 years of CMIP5 model #1:
    M1.1.1, M1.1.2 … M1.1.x … M1.1.n
    +

    +
    M1.20.1, M1.20.2 …M1.20.x … M1.20.n

    The 20 years of model #1 simulated cloud properties are summed and divided by 20 years to give:

    M1.m.1, M1.m.2 … M1.m.x … M1.m.n,

    where again “m” is “mean”, representing the 20-year mean simulated cloud properties at each 1° × 1° grid-point. The dimension attached to the mean is, ‘average-simulated year⁻¹ (grid-point)⁻¹.’

    All the [O.m.n minus Mi.m.n (i=1→27)] differences are then calculated for each 1° × 1° grid-point, yielding 27 sets of 1° × 1° grid-point mean error values, ei.m.n, i=1-27. The dimension attached to the error is, ‘mean-Mi-error year⁻¹ (grid-point)⁻¹.’

    The overall mean model error at each 1° × 1° grid-point from Lauer and Hamilton eqn. 1 is:

    Mean model error = (1/N)[(sum over i=1→N) of (Mi.m.n – O.m.n)] = (1/N)[(sum over i=1→N) of (ei.m.n)], = E.m.n,

    where N = 27, and E.m.n = mean model error at each of “n” 1° × 1° grid-points. The dimension of E is mean-model-error year⁻¹ (grid-point)⁻¹.

    The Lauer and Hamilton global mean long-wave cloud forcing (LCF) uncertainty (U) is the root-mean-square of 27 E.m.n model mean values, LCF error = sqrt[(sum over n = 1→N) of (E.m.n)²/N], = ±U, where N is the total number of 1° × 1° grid-points.

    The dimension of LCF uncertainty is ±U Wm⁻² year⁻¹ (grid-point)⁻¹.

    This means an average ±4 Wm⁻² year⁻¹ LCF uncertainty is uniformly associated with every 1° × 1° grid-point across the globe.
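    The sequence of averaging steps described above can be sketched in plain Python. This is a toy illustration only: the values are synthetic random numbers, the grid is reduced to 12 points, the grid-point average is unweighted, and all names are introduced here for illustration. It shows the order of operations, not the published calculation.

```python
import math
import random

random.seed(0)
N_YEARS, N_MODELS, N_POINTS = 20, 27, 12  # 12 grid points is a toy size


def time_mean(table):
    """Average a [year][grid_point] table over years: one value per grid point."""
    n_years = len(table)
    return [sum(year[p] for year in table) / n_years for p in range(len(table[0]))]


# Synthetic "observed" and "simulated" yearly values at each grid point.
obs = [[30 + random.gauss(0, 2) for _ in range(N_POINTS)] for _ in range(N_YEARS)]
models = [
    [[30 + random.gauss(0, 5) for _ in range(N_POINTS)] for _ in range(N_YEARS)]
    for _ in range(N_MODELS)
]

# Step 1: 20-year means at each grid point (O.m.n and Mi.m.n above).
obs_mean = time_mean(obs)
model_means = [time_mean(m) for m in models]

# Step 2: multi-model mean error at each grid point (E.m.n above).
mean_error = [
    sum(mm[p] - obs_mean[p] for mm in model_means) / N_MODELS
    for p in range(N_POINTS)
]

# Step 3: root-mean-square of E.m.n over all grid points.
lcf_rmse = math.sqrt(sum(e * e for e in mean_error) / N_POINTS)
```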

    Dr. Brown’s quote from Prof. Lauer does not refute this point. Here’s Prof. Lauer as provided: “The RMSE we calculated for the multi-model mean longwave cloud forcing in our 2013 paper is the RMSE of the average *geographical* pattern. This has nothing to do with an error estimate for the global mean value on a particular time scale.”

    Prof Lauer is, of course, correct. The crux issue is that he referred the error to a “particular time scale.” The LCF error they calculated is an error estimate for the global mean simulated value on an averaged time-scale. The mean error is a representative average time-scale error, not a particular time-scale error.

    It should be clear to all that an average of errors says nothing about the magnitude of any particular error, and that an annual average says nothing about a particular year or a particular time-range.

    Following from the step-by-step dimensional analysis above, the Lauer and Hamilton LCF rmse dimension is ±4 Wm⁻² year⁻¹ (grid-point)⁻¹, and is representative of CMIP5 climate models. The geospatial element is included.

    Note the all-caps RMSE in Prof. Lauer’s comment. This emphasizes the plus/minus uncertainty operator missing from Dr. Brown’s error model.

    Next, the linear average that Dr. Brown calculated as his 10-year example has no bearing on the LCF error statistic. Chapter 3, page 48 of Bevington and Robinson summarizes the various ways that errors are combined into uncertainties. [2] All of those ways are combined as some form of root-square or root-mean-square (Ch. 1, p. 11). None of them involve taking linear averages of serial errors.

    Using Dr. Brown’s values, the correct analogy is the square root of the mean square of the differences, i.e., rmse = sqrt[(sum over i of (delta_i)²)/N], which in that case is ±3.35.

    The plus/minus of the Lauer and Hamilton LCF rms error emerges directly using the correct analogy. The fact that the plus/minus is missing from Dr. Brown’s result should have indicated immediately that the wrong error model was used.

    Dr. Brown’s biennial average is also calculated incorrectly.

    The mean uncertainty from each of Dr. Brown’s sets of two years should be the root mean square error of the individual years, not their linear average. [2] For example, the years 1-2 mean error should be sqrt[(1²+2²)/2] = ±1.6, not 0.5, and the final uncertainty over the ten years taken in five biennial steps is again ±3.35.
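    The years 1-2 arithmetic above can be checked with a short sketch. The signs of the two errors are an assumption made here (-1 and +2, chosen so that the linear mean comes out at the 0.5 quoted above); Dr. Brown’s full ten-year values are not reproduced, so the ±3.35 figure is not recomputed.

```python
import math

# Assumed errors for years 1 and 2 (hypothetical signs chosen so that the
# linear mean is 0.5, matching the figure quoted in the text).
errors = [-1.0, 2.0]

linear_mean = sum(errors) / len(errors)
rms = math.sqrt(sum(e ** 2 for e in errors) / len(errors))

print(f"linear mean = {linear_mean:.2f}")  # 0.50
print(f"rms         = {rms:.2f}")          # 1.58
```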

    This is all standard error analysis.

    Dr. Brown’s analysis here reveals a mistake in thinking that pervades his entire analysis. This is to mistake physical error for uncertainty. Error can be calculated for a model result only when the correct, physically true, value is known.

    The condition of knowledge is violated in any projection of future climate, for which factual knowledge is zero. Error cannot be calculated when the true value is unknown. Dr. Brown offered his known-value error-cancellation model as an analogy to an unknown-value uncertainty-propagation.

    The analogy is wrong.

    Propagation of known calibration error is the only way to assess the reliability of a prediction concerning a future state, for which true values are unknowable.

    Finally, the clear per-year dimension in the rmse LCF error shows that Dr. Brown’s conclusion, …

    “The point is that the ±4 W/m² root-mean-square error does not have an intrinsic annual timescale attached to it.”

    … is proven wrong. Likewise, Dr. Brown’s following, …

    “Its units are W/m² not W/m²/year. Thus, the choice to compound annually is arbitrary,”

    … is also wrong.

    The annual propagation step is obviously not arbitrary. The (year)⁻¹ bound emerges directly from the calculation of an annual mean error. ‘Per year’ is the dimension of the yearly mean of multiple-year values, and is unarguably attached to the rms LCF error.

    I’ll point out again that error cancellation by combining positive and negative errors does not remove uncertainty, because the underlying physics is not improved.

    Predictive reliability is not indicated by fortuitous cancellations. Supposing so is to confuse physical accuracy with spurious agreement. Cancelling errors only hides the uncertainty in a result.

    All of Dr. Brown’s argument in Point 1 is now vacated.

    Point 2: Use of spatial root-mean-square instead of global mean net error.

    Dr. Brown wrote, “I used the term “base-state error” in my presentation to distinguish a climatological (time mean) error from a response error. I will acknowledge that this wording was unclear.”

    It was very clear from his presentation that Dr. Brown meant “base-state error,” to mean a one-time single-value offset error in initial conditions.

    In his talk, Dr. Brown (minute 18:12ff) says that, “global mean temperature is related to mean net error.” But to achieve this position, Dr. Brown first incorrectly equated a squared error to the absolute value of error. He said that squaring an error makes it positive.

    But squaring an error does no such thing. Squaring converts error into an uncertainty variance. Equating uncertainty variance with an absolute valued error is the mistake that enables Dr. Brown’s entire subsequent analysis. Dr. Brown’s analysis is thus misguided from the start.

    Next, it’s not clear what Dr. Brown means by (minute 18:25), “Global mean temperature is related to global mean net error, not accumulated absolute error.”

    Global mean [air] temperature is related to global mean net tropospheric forcings plus feedbacks, not net error. Perhaps Dr. Brown meant that, ‘The error in global mean temperature is related to global mean net error, not accumulated absolute error.’

    We’ve already seen that the “accumulated absolute error” idea is wrong. Calculating physical error requires that one knows both the physically correct value and the predicted value. Model error is ‘predicted minus (known correct)’ over a calibration period. Propagating known calculational error as the root-sum-square (rss) is the only way to judge the reliability of a prediction.
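    For readers unfamiliar with the term, root-sum-square combination can be sketched as follows. This is only an illustration of the arithmetic: the per-step value of 1.0 is hypothetical, and the sketch assumes the same uncertainty enters at every step, which is the assumption under dispute in this exchange.

```python
import math

def root_sum_square(step_uncertainties):
    """Combine per-step uncertainties as sqrt(u_1^2 + u_2^2 + ... + u_n^2)."""
    return math.sqrt(sum(u * u for u in step_uncertainties))

# Hypothetical constant per-step uncertainty (arbitrary units).
per_step = 1.0
for n_steps in (1, 4, 25, 100):
    total = root_sum_square([per_step] * n_steps)
    print(f"{n_steps:4d} steps -> +/-{total:.1f}")
```

    With a constant per-step value u, the combined uncertainty after n steps is u*sqrt(n), so it grows with the number of steps but more slowly than a linear sum.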

    With respect to climate models, the propagation analysis concerns predicted future air temperatures. Physically correct future temperatures are unknown and unknowable. No error calculation is possible. This point vacates Dr. Brown’s ultimate claim that model error is the factor under consideration.

    The Lauer and Hamilton ±4Wm⁻² year⁻¹ (grid-point)⁻¹ is the tropospheric thermal uncertainty statistic produced by their model calibration experiment comparing simulated cloud properties to observed cloud properties.

    This uncertainty propagated forward through predicted air temperature ascertains the reliability of the predicted values. Propagated uncertainty reveals nothing whatever about the actual physical error in the future temperatures.

    Dr. Brown wrote that, “if you replace the phrase “base-state error” in my presentation with “climatological mean error” or “annual average dynamical misallocation of simulated tropospheric thermal energy flux, during the 20 years of simulation” it wouldn’t change the point I was making.”

    Perhaps, but the point Dr. Brown was making was mistaken from the start. LCF rmse is not a one-off constant offset error in initial conditions.

    Dr. Brown went on to write that, “This comment [comparing the annual 0.035 W/m² GHG forcing with the ±4 W/m²] gets at the heart of Dr. Frank and my disagreement. The units for the errors are indeed W/m² whereas the units for the annual GHG forcing are W/m²/year.”

    In contrast, the derivation under Point 1 shows that the units for the LCF error statistic are Wm⁻² year⁻¹ (grid-point)⁻¹.

    Dr. Brown’s second argument in this section is now vacated.

    Dr. Brown wrote, “The point I was trying to make was that the root-mean-square error does not allow for spatial cancellation.”

    The issue concerns uncertainty, not physical error. I have pointed out, above, that spatial cancellation of errors is not possible for projected temperatures. Nor does spatial error cancellation reduce the uncertainty in a simulation. Indeed, calculating uncertainty as the root-square explicitly recognizes and eliminates the mistake of imputing causal accuracy, falsely, following a fortuitous cancellation of error.

    Dr. Brown here again confused physical error with uncertainty. Error cannot be calculated for a futures projection in which the correct temperatures are unknown and unknowable. Minimizing net error by averaging positive and negative errors in a calibration experiment does not improve the accuracy of a causal calculation. Doing so merely hides the full measure of causal uncertainty.

    Continuing, “Dr. Frank’s method purports to calculate the uncertainty in global mean surface air temperature projections. Global mean surface air temperature error is related to global mean net flux error (which allows for cancellation) not to spatially calculated root-mean-square error.”

    Dr. Brown here continued to confuse physical error and statistical uncertainty. Global mean surface air temperature projection uncertainty (what I calculate) not physical error (what captures Dr. Brown’s attention) is related to propagated rmse. Again, propagating rmse through a calculation is a standard of error analysis in the physical sciences, to evaluate the reliability of (uncertainty in) a result.

    The rest of Dr. Brown’s argument in this section merely tries to explain away the fact that he incorrectly appended a “±” to a single-sign error. He correctly observed that this procedure produces nonsense.

    However, the nonsense diagnosis then attaches to Dr. Brown’s analysis, not to mine. He appended the incorrect “±” sign to a single-sign error and then incorrectly analogized the result to the error propagation analysis.

    In my propagation analysis, however, the “±” came directly from the multi-model multi-year LCF rms error statistic in Lauer and Hamilton, which is not a single-sign error, but a root-mean-square error statistic, which is inherently “±.”

    All of Dr. Brown’s argument in Point 2 is now vacated.

    Point 3: Use of error in one component of the energy budget rather than error in net imbalance.

    Dr. Brown wrote, “I agree that we desire models that correctly partition the energy flux into its various components. However, when the topic of interest is global mean surface air temperature, it is the global mean net flux that is most relevant, not how that flux is distributed among its components.”

    This is a remarkable statement. Dr. Brown has averred that an incorrect, erroneously simulated, tropospheric thermal energy flux has no impact on simulated air temperature, so long as the global mean energy flux is in balance at the top of the atmosphere (TOA).

    Dr. Brown here has directly implied the necessary existence of one and only one climate state, and one set of climate sub-states, and one air temperature, for each magnitude of total energy flux, so long as TOA flux-balance is maintained.

    The history of the Holocene terrestrial climate categorically refutes this idea. The solar energy flux input has been nearly constant across any given thousand-year range of the last several thousand years, and TOA flux balance has reigned throughout this period.

    Nevertheless, large variations in climate sub-states and in air temperature are evident and obvious within that time range. See, for example, the estimated air temperatures for the last 50,000 years, and the last 2000 years. Relatively large air temperature excursions are visible in virtually every thousand-year interval.

    We also know Dr. Brown’s claim is not true with respect to simulations by trivial inspection of the variability produced by multiple models under the same forcings, such as in Hargreaves and Annan, [3] or the effects of perturbed physics (parameter variation) on projected air temperatures produced by single climate models such as in Rowlands, et al. [4]

    Dr. Brown’s third argument can be set aside.

    Point 4: Use of a base state error rather than a response error.

    Dr. Brown wrote that, “As I say above, the “±” part of the ±5% emissivity error came from taking the square root of the squared error. Thus, the root-mean-square of a +5% error is:

    “SQRT((+5%)²/1) = ±5%.

    “Again, the “±” comes about from adhering to Dr. Frank’s method of using root-mean-square error. Thus, the “±” is not improperly attached to the 5% emissivity offset error.”

    Dr. Brown has misstated the case. In the first part of Dr. Brown’s video assessment of his simple model, from minute 23:58 through to minute 26:39, the error in emissivity was a -5% constant offset error, followed by a linear 5% decrease in emissivity across 100 years. No sqrt[(5%)²] was in sight anywhere.

    Following Dr. Brown’s video argument all the way to the end (minute 28:45), the sqrt[(5%)²] calculation never appeared. He is right that sqrt[(5%)²] = ±5% but his presentation itself never showed that the ±5% was calculated. It just invokes the ±5% apparently out of thin air.

    Even worse, from minute 29:57, Dr. Brown said, “What Dr. Frank is essentially saying is that, because you have an error in the base-state, you now have no idea what the relationship is between height and age.”

    Dr. Brown here described the error he used as a base-state error. Dr. Brown’s base-state error was the constant offset -5% error. It was not ±5%. This is a clear though tacit admission that he appended a “±” to the constant offset -5% single-sign error to enable the propagation.

    Apart from all this, throughout his critique Dr. Brown mistakenly applied this constant “base-state error” idea to rms error, rendering all of it irrelevant. The rmse calculation represents the standard deviation around a set of values.

    Dr. Brown’s single-value “SQRT((+5%)²/1) = ±5%” supposes that a standard deviation can be calculated using a single error-value. This supposition is clearly wrong. The statistical definition of a rmse is sqrt[(sum of squares)/N]. A single value can provide no sum; nor can it represent a standard deviation about a mean. Dr. Brown’s “SQRT” calculation is undefined and meaningless.

    As a relevant aside, in minute 29:46 discussing the age-height epidemiological model, Dr. Brown said, “So, if you use Dr. Frank’s method, though, and you propagate this base-state uncertainty throughout the calculation…”

    But propagating a base-state error is not my method. The LCF rms error is not a base-state error. It represents a mean theory-bias systematic error present within CMIP5 climate models, emergent in every single simulation step. I discussed this point extensively under Section 2 and Section 4 of my previous response.

    In assessing my point that a single-sign offset error cannot be validly propagated, Dr. Brown wrote that,

    “This is an interesting statement. Imagine there are two models, Model A and Model B.

    “Model A has a climatological longwave cloud forcing error of +10 W/m² at every location on earth.

    “Model B has a climatological longwave cloud forcing error of +10 W/m² over half of the earth and -10 W/m² over the other half.

    “In both models, the root-mean-squared error is ± 10 W/m². Dr. Frank seems to be saying that he would not even be able to apply his method to Model A. Is it a requirement that models have offsetting (canceling) spatial errors in order for the error to be propagated? This would seem like an odd requirement.”

    One can discover the mistake in Dr. Brown’s assessment by asking how the global forcing errors of Model A and Model B would be derived. Following convention, the errors must be appraised at a set of grid-locations around the globe. The error, e, at each location, i, is model minus observed. The rmse global uncertainty is then sqrt[(sum over i=1→N) of (eᵢ)²/N], for both Model A and Model B. Both models produce an uncertainty of ±10W/m².
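    The Model A and Model B comparison can be sketched numerically. The grid size of 1000 equal-weight points is an assumption made for illustration; the ±10 W/m² values come from Dr. Brown’s example.

```python
import math

def mean(values):
    """Linear (signed) average."""
    return sum(values) / len(values)

def rmse(values):
    """Root-mean-square: sqrt[(sum of squares)/N]."""
    return math.sqrt(sum(v * v for v in values) / len(values))

n_points = 1000  # hypothetical number of equal-weight grid locations

# Model A: +10 W/m^2 error everywhere.
model_a = [10.0] * n_points
# Model B: +10 W/m^2 over half the grid, -10 W/m^2 over the other half.
model_b = [10.0] * (n_points // 2) + [-10.0] * (n_points // 2)

for name, errs in (("A", model_a), ("B", model_b)):
    print(f"Model {name}: mean = {mean(errs):+.1f}, rmse = {rmse(errs):.1f} W/m^2")
```

    The two models share an rmse of 10 W/m², while their signed means differ (+10 for Model A, 0 for Model B).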

    The correct formulation departs immediately from Dr. Brown’s analysis, whose model yet again assumes uncertainty to be a linear average of error. This is never correct.

    A calculation of the uncertainty arising from a global mean of error will never produce a single-sign value. Appraisal of uncertainty does not involve a single-value error.

    But now to the heart of the matter: Model A and Model B are used to project future temperature. How does one calculate the physical error in a simulated future temperature? The answer is obvious: one does not. The reason is that no ‘model minus observed’ is available for an unknown and unknowable future physical temperature.

    All one can do to assess the reliability of a model futures projection is to propagate the calibration uncertainty.

    For example, with Model A, we know that the forcing error was +10 W/m² during the calibration period. How large is the forcing error in simulated future climate states? Is it +10 W/m²? How would anyone know? The physically true values of any future forcing are unknown and unknowable. How should one then calculate an error value?

    The physically true values of future forcings and feedbacks of a changed climate are also not known. Is it justifiable to merely assume +10 W/m² error? Clearly not, because the physically true state of the future climate is a matter of complete ignorance.

    This point of ignorance is where uncertainty enters. The true error cannot be known. But the ±10W/m² uncertainty of Model A or Model B can be propagated into the simulation of a future climate. The propagated error indicates the reliability of the simulation.

    As the simulation departs further and further into ignorance-land, the uncertainty in the projected climate state grows and the reliability of the simulation diminishes.

    Dr. Brown then disagreed that his base-state constant-offset model is not a correct analogy for the rms ±LCF error statistic of climate models.

    At this point, it should be obvious that Dr. Brown’s disagreement is misplaced.

    Dr. Brown’s final comment in this section, …

    “I agree that the ±4 W/m² LCF error “appears in every single step” but that does not imply that it propagates and compounds in every single step. This is a non-sequitur that amounts to changing the error’s units from ±4 W/m² to ±4 W/m²/year.”

    … was disproved under Point 1 above.

    All of Dr. Brown’s argument in Point 4 is now vacated.

    Point 5: Reality check: Hansen (1988) projection.

    Regarding the criticism of James Hansen’s 1988 Model II projections, and specifically the relevance of his scenario B result, Dr. Brown wrote, “It is not the uncertainty envelope but the relative agreement with observations that tells us something about the physical accuracy of the mean projection.”

    Dr. Brown here suggests that fortuitous correspondence (Hansen’s own words, [5]) is a measure of accuracy.

    The very large uncertainty attending Dr. Hansen’s Scenario B, due to LCF error alone, indicates the projection does not achieve the status of a prediction in science. It therefore offers no accuracy metric.

    Dr. Brown continued, “If climate models gave us zero information on the temperature response to CO2 then we would expect larger disagreements between models and observations.”

    This view neglects the fact that climate models are tuned to observables. Tuning allows different models to have different suites of offsetting parameter errors. Even though they all are adjusted to reproduce target observables, their projections of future air temperatures vary widely. [6]

    For example, in Figure 7 of Chen and Frauenfeld, [7] the spread for year 2100 air temperature after 95 years (2005-2100) of projection from 20 CMIP5 models is 4 °C for RCP 8.5, 3 °C for RCP 4.5, and 2 °C for RCP 2.6. All 20 model-projections average within about 0.1 °C of the published air temperature trend across the entire 20th century, however, reflecting the prior tuning. This well illustrates the point: offsetting errors in the calibration period reflecting tuning, wide variation in the unconstrained projections (imperfectly) reflecting projection unreliability.

    These considerations render Dr. Brown’s point invalid.

    When fed with the IPCC 20th century forcings the linear emulator does a very good job of reproducing the published air temperature trend. No one would argue that this agreement makes the emulator an accurate model of the climate. But it meets Dr. Brown’s criterion.

    Finally, in response to my comment that, “Models have been constructed to require the addition of greenhouse gas forcing in order to reproduce global air temperature. Then turning around and saying that models with greenhouse gas forcings produce temperature projections close to observed air temperatures, is to invoke a circular argument.”

    Dr. Brown wrote in reply, “This is a strange statement. Is it Dr. Frank’s contention that the radiative forcing from greenhouse gasses is something that could have legitimately been left out of a quantitative simulation of climate? The global temperature response to greenhouse gasses is an emergent property of these physical models. It is not some simple arbitrary relationship programmed into the models.”

    Dr. Brown’s critique would have force only if it were known that climate models deploy a complete theory of climate. Such a theory would describe exactly how climate responds to the radiative physics of GHG: how convection responds, how cloud properties and amounts respond, how precipitation responds, and so forth, all of which would have to be verified by direct comparison to accurate observations. However, this has not been done, and indeed is not possible, and none of this physics is known to be properly represented in the models. Dr. Brown’s claim of a reliable emergence is a physics non-sequitur.

    This ends the reply. As before, not one of Dr. Brown’s objections withstands critical scrutiny.

    Citations:

    [1] Lauer, A. and K. Hamilton, Simulating Clouds with Global Climate Models: A Comparison of CMIP5 Results with CMIP3 and Satellite Data. J. Climate, 2013. 26(11): p. 3823-3845.

    [2] Bevington, P.R. and D.K. Robinson, Data Reduction and Error Analysis for the Physical Sciences. 3rd ed. 2003, Boston: McGraw-Hill.

    [3] Hargreaves, J.C. and J.D. Annan, Using ensemble prediction methods to examine regional climate variation under global warming scenarios. Ocean Modelling, 2006. 11(1-2): p. 174-192.

    [4] Rowlands, D.J., et al., Broad range of 2050 warming from an observationally constrained large climate model ensemble. Nature Geosci, 2012. 5(4): p. 256-260.

    [5] Hansen, J.E. (2005) Michael Crichton’s “Scientific Method”. URL: http://www.columbia.edu/~jeh1/2005/Crichton_20050927.pdf Date Accessed: 5 February 2017.

    [6] Kiehl, J.T., Twentieth century climate model response and climate sensitivity. Geophys. Res. Lett., 2007. 34(22): p. L22710.

    [7] Chen, L. and O.W. Frauenfeld, Surface Air Temperature Changes over the Twentieth and Twenty-First Centuries in China Simulated by 20 CMIP5 Models. Journal of Climate, 2014. 27(11): p. 3920-3937.

    • Hi Dr. Frank,

      I will be happy to reply more fully to your most recent comments, but before doing so, I think we really need to drill down on point number 1 because, as I see it, our entire disagreement rests critically on this issue.

      You say:

      “Lauer and Hamilton LCF rmse dimension is ±4 Wm⁻² year⁻¹ (grid-point)⁻¹, and is representative of CMIP5 climate models. The geospatial element is included.

      … Finally, the clear per-year dimension in the rmse LCF error shows that Dr. Brown’s conclusion, …“The point is that the ±4 W/m² root-mean-square error does not have an intrinsic annual timescale attached to it.”… is proven wrong. Likewise, Dr. Brown’s following, …“Its units are W/m² not W/m²/year. Thus, the choice to compound annually is arbitrary,”… is also wrong.”

      The annual propagation step is obviously not arbitrary. The (year)⁻¹ bound emerges directly from the calculation of an annual mean error. ‘Per year’ is the dimension of the yearly mean of multiple-year values, and is unarguably attached to the rms LCF error.

      I must insist that the unit is W/m² not W/m²/year.

      I tried to show this intuitively in the above excel screenshot which illustrates that the underlying temporal resolution of the data can be arbitrarily scaled up to any temporal resolution you want. Remember, the annual timescale is not the ‘native’ temporal resolution of either the climate model data or the observational data. As a thought experiment, imagine that we lived in a world where the convention was to archive data at the 5-year (60 month) temporal resolution instead of the annual temporal resolution. In this world, the data that underlies the Lauer and Hamilton ±4 W/m² value would be an average of 4 60-month-long segments rather than an average of 20 1-year-long segments. If you lived in this world, on what grounds would you argue that the unit for the ±4 value is W/m²/year rather than W/m²/60-months or W/m²/5-years? You wouldn’t have any grounds to argue that.

      That’s the intuition but what’s wrong with your unit accounting? Your mistake is that you are not applying the average formula correctly. For the time-mean that we are discussing, the unit ‘year’ actually appears in the numerator as well as the denominator of the average formula:

      sum( i = 1, n_years ;   year_i * value_i ) / sum (i = 1, n_years; year_i )

      Here is a more intuitive example using IQs instead of years (provided from my boss, Ken Caldeira):

      If it is average IQ of people you want to calculate, you would normally add the IQs and divide by number of people, but you really need to multiply each by 1 person in the numerator also.

      Imagine if you first clustered people by IQ, so you had 3 people with a 90 IQ , 5 with 100 IQ, and 4 with 110 IQ. The mean would be:

      ((3 people * 90 IQ) + (5 people * 100 IQ) + (4 people * 110 IQ)) / (12 people) = 100.83 IQ

      If you didn’t cluster people first, you still would need to write something like:

      ((1 people * 90 IQ) + (1 people * 90 IQ) + (1 people * 90 IQ) + (1 people * 100 IQ) + … ) / (12 people) = 100.83 IQ

      The unit for this average is actually just IQ not IQ/person.

      When it is just 1 value for each thing we are counting we normally get sloppy and don’t think about the unit multiplication in the numerator, but when we are formal, it is there.
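      The clustered IQ average above can be written out as a short sketch. The variable names are hypothetical; the counts and IQ values are the ones given in the example.

```python
# (number of people, IQ) for each cluster, as in the example above.
groups = [(3, 90), (5, 100), (4, 110)]

total_people = sum(count for count, _ in groups)
weighted_sum = sum(count * iq for count, iq in groups)  # units: people * IQ

# Dividing (people * IQ) by people leaves plain IQ.
mean_iq = weighted_sum / total_people
print(round(mean_iq, 2))  # 100.83
```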

      “… All of Dr. Brown’s argument in Point 1 is now vacated.”

      No, it is not.

      • Hi Dr. Frank,

        Here is one more scenario for you. This is the simplest situation that I can think of that gets at the crux of the W/m² vs. W/m²/year issue. I would appreciate an answer to this question before moving on.

        Imagine you are driving down the highway for a total of 3 hours or 180 minutes. For the first hour (60 minutes) you are driving 60 mph and for hours 2 through 3 (minutes 60-180) you are driving 90 mph. What is your average speed? Is it:

        a) 80 mph/hour
        b) 80 mph/minute
        c) 80 mph

        If you choose (a), please explain why you chose (a) instead of (b). If you choose (b), please explain why you chose (b) instead of (a).

        ANSWER:

        The correct answer is (c), which you can get from using either temporal resolution:

        Avg speed = ((1 hour)*(60 mph) + (2 hours)*(90 mph))/(1 hour + 2 hours) = 80 mph

        Avg speed = ((60 minutes)*(60 mph) + (120 minutes)*(90 mph))/(60 minutes + 120 minutes) = 80 mph

        The average speed does not have /hour OR /minute as the denominator because those units are implicitly in the numerator as well.
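        The same calculation as a small sketch, run once with durations in hours and once in minutes. The function name `time_weighted_mean` is hypothetical.

```python
def time_weighted_mean(segments):
    """segments: list of (duration, speed) pairs; any consistent duration unit."""
    total_duration = sum(duration for duration, _ in segments)
    return sum(duration * speed for duration, speed in segments) / total_duration

in_hours = [(1, 60), (2, 90)]       # durations in hours, speeds in mph
in_minutes = [(60, 60), (120, 90)]  # durations in minutes, speeds in mph

print(time_weighted_mean(in_hours))    # 80.0
print(time_weighted_mean(in_minutes))  # 80.0
```

        The duration unit appears in both numerator and denominator, so the result is in mph whichever unit is used.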

      • Yes, the claim that ‘mean minus mean yields mean error per year’ is the composition error that causes most of the problem.

        The mean error is the mean error – it is not mean error per year. Adding ‘per year’ implies that we have deduced a trend. We cannot deduce that trend by merely looking at the RMSE. To deduce a trend we have to compare the annual data – not the means.

        Here is a simple example of synthetic data:

        Set L is a linear function increasing in value with each year. Set R is a random selection from a normal distribution, and Set S is a sine function. Do the RMSEs in and of themselves tell us anything about these functions? No. Only by looking at the yearly data, not the means, are we able to see this.

        If we extend these datasets to 100 or 1000 years Set R and Set S will not require the graph Y-axis be rescaled. The mean error is not accumulating with each timestep. Only Set L, where we have a known incremental error/year, has an accumulative error.

        So, if one wants to hypothesize that the error is accumulative one has to actually *show* that it is accumulative, but the accumulative hypothesis that Dr Frank has put forward has no supporting evidence.
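        A rough sketch of how such synthetic sets could be generated follows. The slope, sine period, random seed, and series lengths are all assumptions; the original spreadsheet values are not shown in this thread.

```python
import math
import random

def make_sets(n_years, seed=0):
    """Build three hypothetical yearly series: linear, random normal, sine."""
    rng = random.Random(seed)
    linear = [0.1 * t for t in range(n_years)]            # Set L
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_years)]  # Set R
    sine = [math.sin(2.0 * math.pi * t / 10.0) for t in range(n_years)]  # Set S
    return {"L": linear, "R": noise, "S": sine}

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

def max_abs(values):
    return max(abs(v) for v in values)

for n_years in (20, 1000):
    for name, values in make_sets(n_years).items():
        print(f"{n_years:5d} yr  Set {name}: rms = {rms(values):7.2f}, "
              f"max |value| = {max_abs(values):7.2f}")
```

        Comparing the 20-year and 1000-year rows shows which series would need the Y-axis rescaled: in this sketch only the linear set keeps growing with the length of the record.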

      • Pat Frank says:

        oneillsinwisconsin wrote, “The mean error is the mean error – it is not mean error per year. Adding ‘per year’ implies that we have deduced a trend.”

        You have 50 cents in one pocket and 82 cents in another. What is the mean cents per pocket? Is it (50 + 82)cents/2 pockets = 66 cents per pocket? Where is the trend?

        Is “per pocket” a direct result of dividing by 2 pockets?

        Honestly, you’re having trouble with 6th grade arithmetic.

        “So, if one wants to hypothesize that the error is accumulative …”

        We’re not discussing error. We’re discussing uncertainty. You’ve made a fatal mistake.

        Uncertainty propagates through a calculation, and it propagates through the steps of a step-wise calculation. It’s standard error analysis in the physical sciences.

  13. Dr Frank: “… the ±4 W/m² LCF root-mean-square-error (rmse) is the annual average CMIP5 thermal flux error…”
    Dr Brown: “Differencing two 20-year means does not produce an ‘annual error’”

    Me: Of course differencing two 20-year means doesn’t produce an annual error. Is that what Dr Frank thinks? [Reads Dr Frank’s comment of February 5, 2017 at 8:28 pm.] It appears so. Hmm… I won’t waste any more time on this.

    [crossposted at ATTP’s]

    • Pat Frank says:

      oneillsinwisconsin wrote, “Me: Of course differencing two 20-year means doesn’t produce an annual error.”

      Sum of 20 years data = 200 W/m². Mean = 200 W/m²/20 years = 10 Wm⁻²year⁻¹
      Sum of 20 years of model sim: 300 W/m². Mean = 300 W/m²/20 years = 15 Wm⁻²year⁻¹

      Difference of means = error = 15 Wm⁻²year⁻¹ minus 10 Wm⁻²year⁻¹ = 5 Wm⁻²year⁻¹.

      • Dr Frank, in my synthetic data example above, the mean of all 4 sets is zero. You have not addressed it.

      • Pat Frank says:

        oneillsinwisconsin, in your example you apparently calculated the rms of the data points. That has no bearing on error. The fact that each of your data sets sums to zero has no bearing on error.

        Random error with a mean of zero and a constant standard deviation decreases as 1/sqrt(N). No one disputes that.
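        The 1/sqrt(N) scaling for the mean of N independent zero-mean random errors can be sketched as below. The standard deviation of 4.0 is a hypothetical value, and the formula assumes independent errors with constant standard deviation.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the mean of n independent errors, each with std sigma."""
    return sigma / math.sqrt(n)

sigma = 4.0  # hypothetical per-measurement standard deviation
for n in (1, 4, 100, 10000):
    print(f"N = {n:5d}: +/-{standard_error(sigma, n):.2f}")
```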

        A spuriously correct result stemming from adventitiously cancelling errors is not a demonstration of knowledge.

  14. Pat Frank says:

    Hi Dr. Brown,

    Before getting to your arguments, it is evident from our conversation that you have not been exposed to calibration, nor to the necessity and use of calibration experiments, nor to the propagation of calibration error through subsequent measurements and experiments.

    This is a very, very serious deficiency in your training. You have every right to be highly upset with the negligent people and programs responsible for this state.

    I have yet to encounter a climate modeler who understands physical error analysis. This consistent evidence makes it likely that climate modeling has sealed itself away from the verdict of experiment and observation, nullifying the threat of empirical disproof. This puts it outside of science. Climate modeling as now practiced is a liberal art wedded to its axioms.

    Turning now to your most recent arguments:
    In summary, they are uniformly mistaken because you,
    • created categorically distinct dimensions that disallow calculating either sums or means;
    • supposed again that error-interval impacts uncertainty;
    • violated long-established standards of meaning.

    Details: in your reply of 7 February, you wrote, “I must insist that the unit is W/m² not W/m²/year. … Your mistake is that you are not applying the average formula correctly. For the time-mean that we are discussing, the unit ‘year’ actually appears in the numerator as well as the denominator of the average formula:

    “sum( i = 1, n_years ; year_i * value_i ) / sum (i = 1, n_years; year_i ).”

    I’ll get to your excel screenshot example below, but will first clear up the dimensional mistake you’re making here.

    Your method dimensions cloud forcing (CF) for year-one into units of W/m²-year_1. Year 2 is then W/m²-year_2. You have converted the time-index into a dimension. The year-by-year dimensions are now not identical. They cannot be summed.

    Again: you have separated each year into a stand-alone dimensional category of one. Categorically disparate quantities cannot be summed.

    Year_1 ≠ year_2 because each year is rendered as dimensionally unique. Year_1 and year_2 are non-identical. They are separated by different time-identities. You have inserted the specific time-step into an integral part of the magnitude dimensional identity.

    How does one carry out the sum (W/m²-year_1) + (W/m²-year_2)? The years are now strictly differentiated time-dimensions and are thereby categorically different. Dimensionally disparate magnitudes cannot be summed.

    You have created an apples–oranges distinction, whereas summing requires identity.

    W/m²-year_1 cannot be summed with W/m²-year_2 because they are not magnitudes having the same dimensional unit. If one goes ahead anyway and sums the W/m² numbers, the disparate dimensions must be retained in the sum. The result is an average containing the dimension, e.g., ‘W/m²-year_1-year_2,’ and is meaningless.

    Your methodological 20-year average is now i*W/m²-year_1 + j*W/m²-year_2 + … + n*W/m²-year_20 divided by 20 years => [N*(W/m²-year_1-year_2- … -year_20)]/20 years = (N/20)(W/m²-year_1-year_2- … -year_20)-year⁻¹.

    None of the year_n’s cancel out with “year” because each one of them is both categorically distinct from each of the others and is distinct from “year” as well; i.e., (year_n)/year ≠ 1 because year_n ≠ year.

    The whole thing is a mess.

    Your claimed W/m² final dimension is ad hoc, because stand-alone W/m² is not produced by your method.

    The identical problem attends your IQ example. Your approach imposes that IQ-person_1 is a categorical dimension, which is of different dimension than IQ-person_2. This is because person_1 ≠ person_2.

    The IQ numbers cannot be summed or averaged because each IQ-person_i is of categorically distinct dimension from every IQ-person_j. Once again, your method is apples plus oranges.

    Your requirement that each item be dimensionally unique forbids making any sum of them at all. Your method disallows addition, because each entry is categorically disparate.

    For the same reason, your method also disallows taking a mean, and ultimately makes nonsense of the entire enterprise.

    An example from Chemistry might make this mistake clearer. Suppose one weighs out some copper and, separately, some iron. Following your method, the weights are of dimension gram-copper and gram-iron.

    Your method disallows summing them, because gram-copper is not gram-iron. They are categorically different dimensional registers. How can the weighings be summed when the units are different? An average weight cannot be calculated. What is average gram-copper-iron? Can 5 Kelvins be summed with 7 quarts, to make 12 Kelvin-quarts? This nonsense is your method.

    To know grams of metal in total — a reasonable idea — requires dropping the elemental identifier so as to unify the dimensional units, i.e., to grams alone. Taking a mean is (i*grams+j*grams)/(2 weighings) = (x_bar*gram of metal)/weighing.

    Note also that one never records a weight as gram-weighing_1, etc., so as to eliminate ‘weighing’ from the dimensions of the mean. The entire idea is false and flies right in the face of long-established analytical meaning.

    If W/m² are to be summed, the year_i categorical dimension must be dropped. Otherwise, each flux magnitude is categorically unique and cannot be summed with anything else.

    Summing cloud forcing requires summing the same unit: W/m². Summing 20 years of CF, and dividing N*W/m² by 20 years, yields the average (W/m²)/year.

    Summarizing: your method incorrectly creates magnitudes of categorically disparate dimension, then improperly adds them, and finally artificially appends a categorically ad hoc dimension. Your method is apples+oranges = strawberries. It is nonsense.

    Ken Caldeira should be chagrined for making such a basic mistake as improper dimensional analysis. So should you be.

    Next, you wrote, “I tried to show this intuitively in the above excel screenshot which illustrates that the underlying temporal resolution of the data can be arbitrarily scaled up to any temporal resolution you want. Remember, the annual timescale is not the ‘native’ temporal resolution of either the climate model data or the observational data.”

    A sum across 20 years, divided by 20 years is an annual average, of dimension per-year.

    You are right that one can average over any time frame one likes. I have already shown several times, however, that averaging the same errors over different time intervals has no significant impact on calculated uncertainties. “Native” has no bearing on anything under discussion.

    Your methodological examples always involved linear addition of errors. That procedure is always incorrect, and is never correct.

    Error cancellation by linear addition of offsetting values merely produces false precision.

    In your 5-year (60 month) example, you wrote, “If you lived in this [5-year] world, on what grounds would you argue that the unit for the ±4 value is W/m²/year rather than W/m²/60-months or W/m²/5-years? You wouldn’t have any grounds to argue that.”

    The Lauer and Hamilton ±4W/m² is the rms annual (per year) uncertainty calculated from the mean annual error from each of 27 models. That is, the root-mean-square error is sqrt[sum over i=1→27(e_i)²/27], where e_i is the annual mean CF error of model i, in Wm⁻²year⁻¹. Twenty years doesn’t enter into the final rms error.

    Obviously reducing 20-years of data to 5 years just means summing 5 years of CF W/m² instead of 20 years. The annual rms error is then calculated as the sum/(5 years).

    For each model, the sum of five years of simulated CF in Wm⁻² divided by 5 years = average annual simulated CF in units of Wm⁻²year⁻¹.

    Likewise five years of observed CF in Wm⁻² is also summed and divided by 5 years, yielding the average annual observed CF in Wm⁻²year⁻¹.

    With only 5 years of data, the twenty-seven 5-year model annual mean errors might be different from the 20-year model annual mean errors, but the method of calculation is identical. Nevertheless, the mean annual error for each model “i” is just e_i = (model_i mean simulated Wm⁻²year⁻¹ minus observed mean Wm⁻²year⁻¹), as before.

    The 5-year rms error = sqrt[sum over i=1→27(e_i)²/27] = ±Wm⁻²year⁻¹.
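    The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the numbers are synthetic and are not Lauer and Hamilton's data, the function names are hypothetical, and the code tracks magnitudes only, so it takes no position on the units under dispute.

```python
import math
import random

def annual_mean(values):
    """Sum of yearly values divided by the number of years."""
    return sum(values) / len(values)

def model_rms_error(simulated_by_model, observed):
    """Root-mean-square of per-model mean errors (simulated mean minus observed mean)."""
    obs_mean = annual_mean(observed)
    errors = [annual_mean(sim) - obs_mean for sim in simulated_by_model]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Synthetic, made-up numbers: 27 hypothetical models, 5 years of data each.
random.seed(0)
n_models, n_years = 27, 5
observed = [30.0 + random.gauss(0, 0.5) for _ in range(n_years)]
simulated = [
    [30.0 + bias + random.gauss(0, 0.5) for _ in range(n_years)]
    for bias in (random.gauss(0, 4.0) for _ in range(n_models))
]

rmse = model_rms_error(simulated, observed)
print(f"rms of {n_models} model mean errors: ±{rmse:.2f}")
```

    Changing n_years from 5 to 20 changes the individual model means but not the method, which is the point made in the surrounding text.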

    Any given 5-year averaged annual rms error might be a bit different in magnitude from ±4 Wm⁻²year⁻¹ because the range of data is short compared to the original 20 years and therefore more sensitive to annual variability.

    However, the units remain identical, and the model rms error propagates to a large uncertainty. In restricting the average to 5 years of CF data, you have gained nothing. The impact on propagated uncertainty is negligible.

    You also wrote, “You wouldn’t have any grounds to argue that … the ±4 value is W/m²/year rather than W/m²/60-months or W/m²/5-years.”

    This point is a little peculiar. Lauer and Hamilton calculated their final rms error using the 27 annual mean model errors. Working backwards from 27 models, using the average ±4Wm⁻²y⁻¹ rms annual error, and the original 20-year range, then a 5-year block-average rms error requires taking the sqrt of a 5-times larger per-model mean error. That is, using the mean model error in Wm⁻²/(5-years). Calculating through, the resulting per-model rms error = ±8.9 Wm⁻²5y⁻¹.

    This 5-year rms error yields an uncertainty of about ±4 C after a 5-year simulation time-step. After 100 projection years in 5-year time-steps it’s ±18 C, which is negligibly different from the original error-propagation result.
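    The figures quoted above are numerically consistent with combining equal per-step uncertainties in quadrature (root-sum-square). That combination rule is an assumption of the sketch below rather than something spelled out in this comment, and the helper name is hypothetical.

```python
import math

def propagate_rss(step_uncertainty, n_steps):
    """Combine n equal, independent per-step uncertainties in quadrature."""
    return math.sqrt(n_steps * step_uncertainty ** 2)

# Five annual ±4 values combined in quadrature: sqrt(5 * 4^2)
block_5yr = propagate_rss(4.0, 5)

# Twenty 5-year steps of about ±4 C each, covering 100 projection years
after_100yr = propagate_rss(4.0, 20)

print(round(block_5yr, 1), round(after_100yr, 1))  # prints: 8.9 17.9
```

    Under that assumption the sketch reproduces the ±8.9 and roughly ±18 values given in the text.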

    So, in fact, you would have exact grounds to argue that the ±4 Wm⁻² is per annum, and per nothing else.

    Your entire argument in this section is vacated.

    In your ‘February 8, 2017 at 5:53 am’ example:

    Mph is a vector quantity. The vector average of 60 mph + 90 mph + 90 mph is just 1/3*(60+90+90)mph = 80 mph.

    However, your entire question employs a misguided analogy.

    The question implicit in the Lauer and Hamilton CF flux average is, ‘What is the average CF flux per year?’ The sum includes 20 years of W/m², which is just a magnitude of singular unit. Every entry of magnitude is dimensionally identical and thus able to be averaged across that time.

    Correctly analogized to flux, your question should be, ‘a traveler drives 60 miles the first hour and 90 miles in each of the next two hours, what is the hourly average speed?’ Once again the sum is over dimensionally identical units of magnitude.

    That answer is (60+90+90) miles = 240 miles per 3 hours, which yields 80 mh⁻¹.

    Your mph/hour is acceleration. Nice try, but one that does you no credit.

    Your final sentence, …

    The average speed does not have /hour OR /minute as the denominator because those units are implicitly in the numerator as well.

    … is falsified by your own “80mph” answer. “mph” = miles per hour = mh⁻¹. The average speed does, in fact, have hour in the denominator. It’s right there in your answer, but you overlooked it.

    Next, if we follow your method and specify the hour in the numerator, the speeds convert into 60mph-hour_1, 90mph-hour_2, and 90mph-hour_3, respectively. Your method again converted the entries into dimensionally disparate magnitudes, and they cannot be summed. They cannot be averaged. A mean cannot be calculated. The same mistake is present here as was made above by imposing ‘Wm⁻²-year_i’ as dimensional categories. The entire method is an exercise in nonsense.

    The Lauer and Hamilton 20-year flux average is the mean of a 20-year sum of a set of magnitudes of identical dimension, Wm⁻², and each model mean or the observational mean is Wm⁻²year⁻¹. Their difference as model error is also Wm⁻²year⁻¹. The rms of 27 model mean error values of Wm⁻²year⁻¹ is ±Wm⁻²year⁻¹.

    Your entire critique is voided.

    None of your arguments has withstood critical examination.

    • ptbrown31 says:

      Hi Dr. Frank,

      I was hoping that we would be able to have a productive scientific discussion on this topic but I am pretty pessimistic that there is much hope for that moving forward. One prerequisite for a truly productive discussion is that both parties are charitable and do their best to understand the substantive points being made by the other party. However, it seems to me that your primary goal is not to understand my arguments but rather to score ‘debate points’ by any means necessary. Specifically, you have a tendency to look past the substance of a point being made in order to create a straw man, destroy it, and then declare victory…all while exuding condescension. This may make you look intelligent and authoritative to some 3rd party observers but it does not actually make you any more correct.

      I have provided some thoughts on your most recent comments below but I feel that this discussion is now going in circles and it would probably be best to just bring it to a close.

      Addressing some specific comments…

      “it is evident from our conversation that you have not been exposed to calibration, nor to the necessity and use of calibration experiments, nor to the propagation of calibration error through subsequent measurements and experiments.

      This is a very, very serious deficiency in your training. You have every right to be highly upset with the negligent people and programs responsible for this state”

      This type of condescension is evidence to me that your goal is simply to make yourself sound as authoritative and intelligent as possible while painting your detractors as uninformed as possible. This is a superficial strategy that only makes sense if you just want to score debate points rather than progressing closer to the truth.

      “The average speed does not have /hour OR /minute as the denominator because those units are implicitly in the numerator as well.”

      … is falsified by your own “80mph” answer. “mph” = miles per hour = mh⁻¹. The average speed does, in fact, have hour in the denominator. It’s right there in your answer, but you overlooked it.

      This statement is a prime example of why I have lost my patience with this discussion. Please go read the comment you are referencing again. It is perfectly obvious that I was referring to the 2nd “/hour” (mph/hour) and not the first “/hour” that is implicit in “mph”. However, it seems that you chose to ignore the true meaning of what I was writing because you saw an opportunity to claim that I had made a mistake.

      “Your method dimensions cloud forcing (CF) for year-one into units of W/m²-year_1. Year 2 is then W/m²-year_2. You have converted the time-index into a dimension. The year-by-year dimensions are now not identical. They cannot be summed…

      … You have created an apples–oranges distinction, whereas summing requires identity….

      … The IQ numbers cannot be summed or averaged because each IQ-person_i is of categorically distinct dimension from every IQ-person_j. Once again, your method is apples plus oranges….

      … if we follow your method and specify the hour in the numerator, the speeds convert into 60mph-hour_1, 90mph-hour_2, and 90mph-hour_3…”

      Etc.

      This whole discussion of dimensional categories is a distraction. I was evoking the standard weighted mean formula:

      x_bar = [sum over (i=1→n)](xᵢ*wᵢ)/[sum over (i=1→n)](wᵢ)

      https://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Mathematical_definition

      The “year_i” was meant to represent the weight for each value in the average. So each weight is 1 year, not year_1, year_2, etc. However, instead of being charitable and inferring that I was referencing a standard formula, you have (intentionally?) framed the argument as nonsense that you can easily dismiss. This seems like classic strawmanning and it is another example of why I have lost patience with this discussion.

      “If W/m² are to be summed, the year_i categorical dimension must be dropped. Otherwise, each flux magnitude is categorically unique and cannot be summed with anything else.

      Summing cloud forcing requires summing the same unit: W/m². Summing 20 years of CF, and dividing N*W/m² by 20 years, yields the average (W/m²)/year.”

      The year_i is not a “categorical dimension”. It is a weight for each of the flux values (which is just 1 year for each in this case). The simple time mean we are discussing is a special case of the weighted mean where the weights are identical for each value. The unit being summed in the numerator is the same. It is (W/m²)*(year). So the average has units of W/m².
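      The arithmetic of this special case can be checked with a short sketch. The flux values below are made up and the function name is hypothetical; the code only shows that equal weights reduce the weighted mean to the simple mean, and it does not by itself settle the question of units.

```python
def weighted_mean(values, weights):
    """Standard weighted arithmetic mean: sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

# Hypothetical yearly flux values; every year gets the same weight (1 year).
flux = [3.0, 5.0, 4.0, 6.0]
equal_weights = [1.0] * len(flux)

simple_mean = sum(flux) / len(flux)
print(weighted_mean(flux, equal_weights), simple_mean)  # prints: 4.5 4.5
```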

      “Your methodological examples always involved linear addition of errors. That procedure is always incorrect, and is never correct.

      Error cancellation by linear addition of offsetting values merely produces false precision.”

      You can make these statements as definitively as you like but it doesn’t make them true.

      While it may be true that the RMSE is the proper way to characterize the error in space (as is done in Lauer and Hamilton), it is necessary to allow for error cancellation in time (also done in Lauer and Hamilton). This is because we are interested in how well climate models simulate the climate, not how well climate models simulate the precise timing of weather. Climate models cannot be expected to reproduce the timing of weather variability because the precise timing of such variability is chaotic and is not a function of the boundary conditions (external radiative forcings) that are driving the model. At best, you can only expect a model to reproduce the average weather over a long time period. Thus, it makes the most sense to calculate the error in time with linear addition.
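      The difference between the two ways of combining errors in time can be seen in a toy example. The error values below are invented for illustration and the function names are hypothetical; the sketch shows only how the two statistics behave, not which one is appropriate.

```python
import math

def mean_error(errors):
    """Linear (signed) average: offsetting errors cancel."""
    return sum(errors) / len(errors)

def rms_error(errors):
    """Root-mean-square: signs are discarded before averaging."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Made-up yearly errors that alternate in sign around zero.
errors = [2.0, -2.0, 2.0, -2.0]
print(mean_error(errors), rms_error(errors))  # prints: 0.0 2.0
```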

      By the way, if I wanted to phrase the above point in the writing style that you have adopted for this discussion, I would say something like this:

      “Dr. Frank has made a fatal error in assuming that climate models can be expected to reproduce the timing of weather variability. His error betrays a fundamental lack of understanding of climate modelling and its purpose. It is never the case that small timescale climate model errors should be squared and summed in time. This is standard climate model evaluation. Dr. Frank’s mistake pervades and undermines his entire analysis.”

      Would using this tone make the underlying point any more or less correct? No, it would not. Taking this tone would just be unnecessarily inflammatory and counterproductive.

      “Obviously reducing 20-years of data to 5 years just means summing 5 years of CF W/m² instead of 20 years. The annual rms error is then calculated as the sum/(5 years).

      For each model, the sum of five years of simulated CF in Wm⁻² divided by 5 years = average annual simulated CF in units of Wm⁻²year⁻¹.”

      My hypothetical scenario did not suppose 5 years of data instead of 20 years of data. It supposed that the average over the full time period consisted of averaging four 60-month-long blocks rather than averaging twenty 12-month-long blocks. In that scenario, the average (the way you are interpreting it) would be a number “per 60-months” rather than “per year”.
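      The hypothetical can be sketched as follows, using a synthetic 240-month series and hypothetical helper names. It only demonstrates that the full-period mean is the same whether the data are first grouped into 12-month or 60-month blocks.

```python
def block_means(monthly, block_len):
    """Average consecutive, equal-length blocks of monthly values."""
    if len(monthly) % block_len != 0:
        raise ValueError("series length must be a multiple of block_len")
    return [
        sum(monthly[i:i + block_len]) / block_len
        for i in range(0, len(monthly), block_len)
    ]

def mean(values):
    return sum(values) / len(values)

# Synthetic 20-year monthly series (240 values), purely illustrative.
monthly = [float((m * 7) % 13) for m in range(240)]

annual = block_means(monthly, 12)      # twenty 12-month blocks
five_year = block_means(monthly, 60)   # four 60-month blocks
print(len(annual), len(five_year))
print(mean(annual), mean(five_year), mean(monthly))
```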

      Anyway, we are just going in circles at this point so I think we have extracted all the value that we are going to get out of this conversation. I did find this dialog to be quite interesting and I thank you for engaging with me.

  15. Pat Frank says:

    Dr. Brown, I had a thought today about employing your time-as-dimension approach to averaging cloud forcing. As you have it, the numerator includes the year as a dimension.

    The first year of data in the course of taking a 20-year average of CF is then Wm⁻²-year_1. Let’s accept that for the sake of what follows.

    Year_1 is a time-unique dimension. There is only one of them. For every CF entry year_i ≢ year_j. There is only one cloud forcing for each unique year_i, Wm⁻²-year_i.

    The complete analytical description for the first year of CF data is that there is one Wm⁻²-year_1 per the unitary year_1. Written out, it’s (Wm⁻²-year_1)/(1*year_1).

    Following that through, the first year CF simplifies as (Wm⁻²-year_1)/(1*year_1) = (Wm⁻²)/(1) = Wm⁻². The same applies to each of the 20 dimensionally unique years going into the average.

    Applying the algebra and summing the 20 individual annual entries:
    [(a*Wm⁻²-year_1)/(1*year_1)] + [(b*Wm⁻²-year_2)/(1*year_2)] + … + [(n*Wm⁻²-year_20)/(1*year_20)] => (a+b+…+n)*Wm⁻² = N*Wm⁻².

    Dividing N*Wm⁻² by 20 years = (NWm⁻²)/20 years = (N/20)Wm⁻²year⁻¹, and we’re back where we started with ‘per year’ in the annual average. The restraint to annual error is evident.

  16. Pat Frank says:

    Dr. Brown, the personal attack in your 14 February preamble is personally disappointing, is unworthy of you, and is unjustified by anything that came before.

    You wrote, “[I]t seems to me that your primary goal is not to understand my arguments but rather to score ‘debate points’ by any means necessary. Specifically, you have a tendency to look past the substance of a point being made in order to create a straw man, destroy it, and then declare victory…all while exuding condescension.”

    The evidence of my posts does not support your accusation. They addressed your arguments point-by-point, in analytical detail, and even included citations to appropriate literature. Your “straw-man” accusation is categorically refuted by the detailed analyses present in my replies.

    I regret being a source of distress. However, it is not condescension to observe that your analysis shows no sign of prior exposure to calibration or to error propagation.

    These and other points are taken up below. My summary conclusion is that your critique has not touched the error propagation analysis.

    1. Language
    Toward the end of your response, you objected to my use of “fatal error” as inflammatory language. I went through your video again, and culled out the following:

    12:36 A completely unphysical range of uncertainty
    12:39 totally not plausible
    12:48 implausible as well
    13:42 arbitrary use of 1-year compounding time-scale
    15:03 arbitrary use of 1-year compounding time-scale
    16:58 arbitrary decisions
    30:06 doesn’t make sense
    36:28 method is inadequate
    36:57 doesn’t make sense
    37:18 doesn’t make sense
    37:49 doesn’t make sense

    I didn’t take offense at your wordings. They are appropriate to an acute scientific debate. However, they do show you did not tiptoe around strong language. “Fatal error” is no stronger a judgment than “totally not plausible.”

    I searched the blog page for my use of “fatal.” It does not occur in any post to you. It appeared only in the February 12, 2017 at 7:42 pm post to oneillsinwisconsin. His critique referenced error rather than uncertainty, which is indeed a mistake fatal to his argument.

    I am not upset at “fatal error” in your 14 February weather vs. climate criticism, either. Fair is fair, and a truly fatal error is a fatal error. There’s no use getting upset about a true diagnosis. The weather-climate distinction, however, is analytically irrelevant (discussed below).

    2. Condescension
    You accused me of condescension in observing that you have evidently not been exposed to calibration or to error propagation. “Evidently” is the key word.

    Evidence includes minute 12:35 of your video, where you observed that the ±17 C uncertainty obtained after 100 years of RCP 8.5 is, “a completely unphysical range of uncertainty, so it’s totally not plausible that temperature could decrease by 15 degrees as we’re increasing CO₂. And it’s implausible as well that temperature could increase by 17 degrees as we’re increasing CO₂ under the RCP 8.5 scenario. But as I understand it, this is the point Dr. Frank is trying to make.”

    You clearly misinterpreted the ±17 C uncertainty as a physical temperature. It is not. Presuming so clearly evidences no understanding of uncertainty.

    You invariably confused the 20-year rmse “±” uncertainty of Lauer and Hamilton as a single-sign base-state constant offset error. This again evidenced no understanding of uncertainty.

    Nowhere did you exhibit any realization that the Lauer and Hamilton comparison of simulation and observation calibrates model accuracy, or that the annual average ±4 Wm⁻² LCF error is a calibration error that conditions CMIP5 simulations.

    Taken together, these mistakes make it very clear that you have never been exposed to calibration, nor to the necessity and use of calibration experiments, nor to the propagation of calibration error.

    It is not condescension to point out obviously mistaken thinking during a debate about science. I understand that the judgment is painful. I would hate to receive it myself. Nevertheless it is a valid criticism, and potentially constructive.

    3. Error Propagation
    For your reference Harvard U. provides a description of error propagation …

    https://www.cfa.harvard.edu/~scranmer/Ay201a/Data/uncert_pilman.pdf

    … that completely supports the approach I used. So does Bevington and Robinson.

    4. Means and Wiki
    Regarding means, you wrote that, “This whole discussion of dimensional categories is a distraction. I was evoking the standard weighted mean formula:

    x_bar = [sum over (i=1→n)](xᵢ*wᵢ)/[sum over(i=1→n)](wᵢ),

    https://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Mathematical_definition

    In that equation, each xᵢ is a mean; i.e., the Wiki equation is for a mean of weighted means.

    Let’s find out if you had originally invoked that formula.

    Here’s what you wrote on 1 February: “The point is that the ±4 W/m² root-mean-square error does not have an intrinsic annual timescale attached to it. Its units are W/m² not W/m²/year. Thus, the choice to compound annually is arbitrary.”

    Which you followed up on 7 February with, “Your mistake is that you are not applying the average formula correctly. For the time-mean that we are discussing, the unit ‘year’ actually appears in the numerator as well as the denominator of the average formula:

    sum( i = 1, n_years ; year_i * value_i ) / sum (i = 1, n_years; year_i )” (my bold)

    You presented your formula as a mean of magnitudes, not a mean of weighted means.

    Further, in the Wiki formula that you cited as providing your true meaning, all the w_i are dimensionless real-number positive weights. The w_i have no physical label. Your “year_i” is a physical label, which you retained in order to cancel the “year” denominator when calculating the mean. The Wiki w_i are not equivalent to your year_i.

    The Wiki equation has physical dimension only in the x_i. It does not reproduce your original formalism at all.

    You specifically made “year_i” part of the error dimension in the numerator so as to have it cancel the “year” in the denominator when taking the mean. For every “i”, that requires a numerator x_i*Wm⁻²-year_i.

    You now wrote, “The “year_i” was meant to represent the weight for each value in the average. So each weight is 1 year, not year_1, year_2, etc.”

    So, let’s see if we understand this. Where “i” = 1, your “year_i” does not become your prior “year_1.” It now becomes, “1 year” instead.

    You have changed the meaning of “year_i” between posts. Now you claim I should have intuited this logical non sequitur, and are then aggrieved that I did not.

    4.2 Wiki and Contradiction
    Also: your present explanation is inconsistent with your previous formalism. You now say “year_i” is always identical to unity.

    That means when i = 2, year_2 also means “1*year.” Thus when i = 2, and “year_i*value_i” means “1*year_2*value_2,” you have adduced “1” as the weight for every year, and “year_2” is not a weight.

    But you wrote that, “The year_i is not a “categorical dimension”. It is a weight for each of the flux values (which is just 1 year for each in this case).”

    We are now faced with the contradiction that “year_i” is a weight and i=1→20, except that “1” is always the weight.

    You first made w_i = year_i, referencing Wiki. You then made all w_i ≡ 1. The “year_i” is suddenly not a weight. Your explanation has become incoherent.

    You wrote, “The simple time mean we are discussing is a special case of the weighted mean where the weights are identical for each value.”

    Except in your new approach, the unit weights are parachuted in by hand. They do not derive from the values of “i” in your original equation. You have inserted the unit weights without any warrant.

    Proceeding: “The unit being summed in the numerator is the same. It is (W/m²)*(year). So the average has units of W/m².”

    You have now either disappeared all the “i” indexes for the years, or apparently claim that all “i” = 1. When did that happen? Previously, the “i” indexed the year, and increased serially from 1 to 20.

    Is that allowed — to disappear an index or to change a serially increasing index into a unit weight? Doing so does get you out of a tight fix, but it seems opportunistic and certainly violates good analytical order.

    The “year” in your W/m²-year stems from your claim that the denominator unit is implicit in the numerator when taking a mean. This allows you to claim that an annual CF mean, “has units of W/m²,” rather than units of W/m²/year.

    +++++++++++++
    As an aside, a simple example will show the failure of this idea:

    John runs the first mile in 5 minutes, the second in 6 minutes and the third mile in 7 minutes. What is his average speed?

    Using the Brown/Caldeira approach to means, we include the denominator unit in the numerator. John isn’t running miles. He is running mile-minutes.

    Calculating the average speed, (1 mile-minute + 1 mile-minute + 1 mile-minute)/18 minutes = 1/6 mile. The Brown/Caldeira speed has no time dimension. John is running at a rate of 1/6 mile. Do not add “per minute” in your mind, because it cancelled between the numerator and the denominator. “Per minute” is not present.

    No denominator exists in a Brown/Caldeira mean. Their answer, 1/6 mile, is not a speed but is nonsense.

    By the same token, the global LCF averaged across a given year is W/m² not W/m²-year. The unit W/m² alone does not represent an annual mean of 20 years of LCF summed and divided by 20 years.
    +++++++++++++
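    For comparison, the conventional calculation for the running example in the aside can be written out as a minimal sketch. The variable names are hypothetical; the distances and times are the ones given in the example.

```python
# Three miles run in 5, 6 and 7 minutes respectively.
miles = [1.0, 1.0, 1.0]
minutes = [5.0, 6.0, 7.0]

total_distance = sum(miles)      # 3 miles
total_time = sum(minutes)        # 18 minutes
average_speed = total_distance / total_time  # miles per minute

# Note: this differs from the plain mean of the three per-mile speeds.
mean_of_speeds = sum(d / t for d, t in zip(miles, minutes)) / len(miles)
print(average_speed, mean_of_speeds)
```

    The first value is 3 miles over 18 minutes, i.e. 1/6 mile per minute; the second is slightly larger because the per-mile speeds are not time-weighted.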

    5. Strict Constructionism
    Continuing from above: consistency requires retaining the original meaning of the annual index. It then becomes clear that the multiplied value_i*year_i, makes the “year_i” once again a dimensional unit. My 11 February analysis follows. A series of values of dimension W/m²-year_i, i=1→20, cannot be summed because they are dimensionally disparate.

    Things become even stranger when we retain the original serial index, as we should to be consistent, but allow your new conversion of the indexing “i” into weights. You originally sequentially indexed each year using “i,” where i = 1→20.

    Following from that, year_1*value_1 is produced for the first year, year_2*value_2 for the second, etc., right through to year_20*value_20 for the twentieth year.

    Your new weighting scheme converts year_i*value_i into 1*year*x_1 Wm⁻² for i = 1. That is, the year index “i” is now an integer weight.

    In that case, for i = 2, the “i” weighting for “year_i*value_i” should then be “2*year,” not 1*year, because i = 2. Proceeding, when i = 20, the i-weighting should be “20*year.”

    That means, according to your assigned w_i = “i,” each Wm⁻² is multiplied by its year-index. Year 1 = 1*year*x_1 Wm⁻² = 1*x_1 Wm⁻²-year.

    Likewise, year 2 = 2*year*x_2 Wm⁻² = 2*x_2 Wm⁻²-year, and finally year 20 = 20*x_20 Wm⁻²-year.

    Your new “i” = “w_i” explanation multiplies each annual CF by the index number of that year, right up through 20. The index = weight reinterpretation has produced a new kind of nonsense.

    5.2. Constructing Intentionality
    Continuing: you wrote, “However, instead of being charitable and inferring that I was referencing a standard formula, you have (intentionally?) framed the argument as nonsense that you can easily dismiss.”

    Your original numerator, “year_i*value_i,” guided my interpretation. Your original formalism multiplied “year_i” by “value_i.” They are not added.

    As an index, year_i*value_i = year_i-value_i = value_i-year_i, which is exactly as I represented it.

    You never mentioned weights. Lauer and Hamilton used no weights in their mean. Where is any possible inference about weights?

    My interpretation followed directly from your notation. But you imply dishonesty.

    Now we are told that your “i” index was meant to be an integer weight, “i” = “w_i”.

    However, nowhere is the “i” of year_i distinguished in meaning from the “i” of value_i in “year_i*value_i.”

    You have left these “i’s” to take identical meanings. Should we not proceed in that manner?

    Consistency of “i” = “w_i” notation then requires that “i” multiplies twice through your numerator. So “year_i*value_i” really means ‘year*i*value*i’ = ‘i*i*year*value,’ where i = 1→20. The new “i” weighting scheme makes yearly values increase by the square of the “i” index.

    My analysis of your work directly followed from the original meaning of your February 7 insertion of dimensional “year_i” into the numerator. In that case, x_i*Wm⁻²-year_i cannot be summed with x_j*Wm⁻²-year_j because the year_i, year_j dimensions are not identical, and the enforced sum of the x-values produces nonsense.

    Lauer and Hamilton themselves described their calculations as, “comparisons of the annual mean cloud properties with observations,” i.e., the annual mean CF properties are Wm⁻²y⁻¹, and in the legend of Figure 3, described the “20-yr annual average performance of the … CMIP5 models.” (my bold)

    An annual mean is the per-year average. It is a representation of the magnitude of error to expect in any arbitrary simulation year.

    For your reference, here’s a nice explanation for the taking of means.

    https://www.varsitytutors.com/gre_math-help/how-to-find-arithmetic-mean

    Not one example that includes dimensions puts the dimension of the denominator into the numerator, in contradiction to your requirement. Nor do denominators cancel somehow, nor is any dimensional average denominator-free. All the examples with dimension in the denominator produce means of notation (denominator)⁻¹.

    See also the example given under section 4.2.

    6. Lauer and Hamilton and Wiki
    Next, your Wiki equation is incorrectly chosen. Lauer and Hamilton never took a mean of weighted means.

    Lauer and Hamilton took direct annual means of CF, mean = [sum of x_i over (i=1→20)/20 y] for observed and simulated CF, where “i” is just an index, x is in Wm⁻², and mean is dimensioned Wm⁻²year⁻¹. They calculated each mean model error e_i = (Wm⁻²year⁻¹_sim minus Wm⁻²year⁻¹_obsvd) = Wm⁻²y⁻¹. They then calculated the rmse as sqrt[sum over (i=1→27)(e_i)²/27] = Wm⁻²y⁻¹.

    In their equation the x_i are the Wm⁻²_i, where “i” is just an index of the year (not dimensional) and the w_i are trivially equal to 1.
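
    The point about unit weights can be checked numerically. The following is a minimal sketch, assuming placeholder annual values (the integers 1 to 20, not Lauer and Hamilton's data) and hypothetical helper names: a weighted mean with every w_i = 1 gives the same number as the plain arithmetic mean.

```python
# Placeholder annual values; these are NOT Lauer and Hamilton's data.
x = [float(v) for v in range(1, 21)]   # 20 hypothetical annual values
w = [1.0] * len(x)                      # unit weights, w_i = 1


def weighted_mean(values, weights):
    """Weighted mean: sum(w_i * x_i) / sum(w_i)."""
    return sum(wi * xi for wi, xi in zip(weights, values)) / sum(weights)


def arithmetic_mean(values):
    """Plain mean: sum(x_i) / N."""
    return sum(values) / len(values)


wm = weighted_mean(x, w)
am = arithmetic_mean(x)
print(wm, am)  # both means are the same number when all weights are 1
```

    The sketch is unitless on purpose; it only shows that unit weights add nothing to the arithmetic.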

    In short, the Wiki equation is not your 7 February equation. Your Wiki explanation is not an explanation. It is a mistake, at best.

    7. Climate vs. Weather
    The LCF errors in Lauer and Hamilton are annual means taken from 20 years of data. Twenty combined years of data have nothing to do with weather, and everything to do with climate.

    You wrote, “While it may be true that the RMSE is the proper way to characterize the error in space (as is done in Lauer and Hamilton), it is necessary to allow for error cancellation in time (also done in Lauer and Hamilton).”

    When simulating a future climate, how does one calculate error? There are no physical observations against which to test the simulation. Any such observables lie in the future. No one knows what they are or will be. No one can know what they are or will be. No simulation error can be calculated. One does not know the errors. One can’t know the errors. How, then, should one add the errors?

    In the circumstance of unknowable error, one turns to model errors determined from calibration experiments. For climate models, these experiments are hindcasts judged against past observables. The Lauer and Hamilton LCF rmse calibration error is a fine example.

    The model errors derived from the calibration are then propagated forward through the simulation of future climate to evaluate the uncertainty in the projected observables. This is standard practice in the physical sciences. It is never done in climate modeling.

    Lauer and Hamilton’s LCF rmse was calculated through time, and the method they used allowed positive and negative errors to cancel in the mean.

    Let a 20-year mean LCF observable = O_m = (o_1 + o_2 + … + o_20)/20 years = Wm⁻²y⁻¹.

    Likewise for any model, mean LCF simulation = S_m = (s_1 + s_2 + … + s_20)/20 years = Wm⁻²y⁻¹.

    Model mean error = e_m = (S_m – O_m) = (1/20)*[(s_1 + s_2 + … + s_20) – (o_1 + o_2 + … + o_20)] = (1/20)*[(s_1 – o_1) + (s_2 – o_2) + … + (s_20 – o_20)] = (1/20)*(e_1 + e_2 + … + e_20) = Wm⁻²y⁻¹.

    The model 20-year mean error is obtained from the sum of errors across time, just as you required.

    The Lauer and Hamilton LCF rmse is then sqrt[sum over (i=1→27)[(e_m)ᵢ]²/27] = ±Wm⁻²y⁻¹ for 27 CMIP5 models and is a representative CMIP5 uncertainty in simulated annual tropospheric thermal flux.
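
    The steps above can be sketched numerically. This is an illustration only, assuming randomly generated placeholder series (not CMIP5 output or satellite observations) and hypothetical function names. It checks that the difference of the two 20-year means equals the mean of the signed annual errors, so that positive and negative annual errors cancel in time, and then takes the root-mean-square over 27 per-model mean errors.

```python
import math
import random

random.seed(0)  # reproducible placeholder data
N_YEARS, N_MODELS = 20, 27

# Placeholder series; NOT real CMIP5 simulations or observations.
obs = [random.gauss(0.0, 1.0) for _ in range(N_YEARS)]
sims = [[o + random.gauss(0.0, 1.0) for o in obs] for _ in range(N_MODELS)]


def mean(values):
    return sum(values) / len(values)


def model_mean_error(sim, observed):
    """e_m = S_m - O_m, the difference of the two 20-year means."""
    return mean(sim) - mean(observed)


def mean_of_annual_errors(sim, observed):
    """(1/20) * sum(s_i - o_i); signed annual errors may cancel."""
    return mean([s - o for s, o in zip(sim, observed)])


def rmse(errors):
    """sqrt(sum(e^2) / n) taken over the per-model mean errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


e_m = [model_mean_error(sim, obs) for sim in sims]
multi_model_rmse = rmse(e_m)
print(multi_model_rmse)
```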

    You wrote, “My hypothetical scenario did not suppose 5 years of data instead of 20 years of data. It supposed that the average over the full time period consisted of averaging four 60-month-long blocks rather than averaging twenty 12-month-long blocks. In that scenario, the average (the way you are interpreting it) would be a number “per 60-months” rather than “per year”.”

    I answered this question a few lines after the text you quoted, and repeat the answer here: a 5-year block-average rms error requires taking the sqrt of a 5-times larger per-model mean error. That is, using the mean model error in Wm⁻²/(5-years). Calculating through, the resulting per-model rms error = ±8.9 Wm⁻²5y⁻¹.

    This 5-year rms error yields an uncertainty of about ±4 C after a 5-year simulation time-step. After 100 projection years in 5-year time-steps it’s ±18 C, which is negligibly different from the original error-propagation result.
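
    The arithmetic behind these two figures can be sketched as follows, assuming the per-step uncertainties are equal and combine as a root-sum-square across time-steps. That combination rule is an assumption of this sketch, and the function name is hypothetical.

```python
import math


def propagate_rss(step_uncertainty, n_steps):
    """Root-sum-square of n_steps equal per-step uncertainties (assumed rule)."""
    return math.sqrt(n_steps * step_uncertainty ** 2)


per_step = 4.0        # +/- C per 5-year time-step, as quoted in the text
n_steps = 100 // 5    # 100 projection years in 5-year steps = 20 steps
total = propagate_rss(per_step, n_steps)
print(round(total, 1))  # 17.9
```

    Under that assumption, 20 steps of ±4 C give about ±17.9 C, which rounds to the ±18 C stated above.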

    Nothing changes when using a per-60-months error.

    To conclude, your critique has not touched the error propagation analysis.

    Finally, Dr. Brown, thank you for your interest, and for carrying on your critique in a polite and civil manner. I do appreciate that. I also truly regret any hurt feelings. None were intended, and best wishes.

    • Pat,

      “Calculating the average speed, (1 mile-minute + 1 mile-minute + 1 mile-minute)/18 minutes = 1/6 mile. The Brown/Caldiera speed has no time dimension. John is running at a rate of 1/6 mile.”

      This seems ridiculous. He runs 3 miles in 18 minutes. His speed is therefore (3 miles)/(18 minutes) = 0.17 mile/minute. Of course his speed has a time dimension, it’s defined as distance/time.

      • Pat Frank says:

        You clearly haven’t followed the discussion, AT.

        According to Dr. Brown, stated twice in his February 7, 2017 at 9:56 pm post, and repeated in his February 8, 2017 at 5:53 am post, the denominator in a mean is implicitly also present in the numerator.

        According to Dr. Brown, it therefore cancels out and disappears.

        From the 7 Feb. 9:56 pm post, apparently Prof. Caldeira subscribes to this view as well. So, for them, speed has units of distance only.

        For the rest of us, it’s distance/(unit time).

        With you unfairly deleting my posts at ATTP, by the way, I’m unlikely to post there anymore.

      • ptbrown31 says:

        If you read my posts it’s clearly not my view that speed has units of distance only.

      • Pat,
        The idea that you could think that Patrick Brown was suggesting that speed only has units of distance is bizarre.

        “With you unfairly deleting my posts at ATTP, by the way, I’m unlikely to post there anymore.”

        Actually that was the person who helps me moderate my blog, so I’m not quite sure what was deleted. However, he’s often pretty even-handed – even I get my things deleted now and again.
