Chemical Mass Balance Technique
Overview

A chemical mass balance (CMB) receptor modeling approach is used to quantify the source contributions to fallout particulate at a receptor. This method is based on direct measurement of the chemical composition of fallout particles present in the area of interest. The relative apportionment of these chemical species between potential sources is based on a statistical comparison of a chemical profile or "fingerprint" of each source with the chemical profile of an ambient fallout particle sample.

With this "fingerprinting" approach, impacts are based on retrospective measurements of samples selected from a specific period of potential maximum impact. Results represent the most probable quantitative source impacts for each specific sample selected.

The Chemical Mass Balance Method

The relationship between particulate emissions and ambient fallout concentrations measured at a receptor (pollutant sampler) site distant from an emitting source is complicated. Many variables, primarily meteorological, make the direct correlation between source emissions and ambient concentrations a poor one. Each of these variables is random in nature, will vary with space and time, and may combine with other variables in a nonlinear manner. Thus, any estimation of source contribution to fallout particles based on emissions and meteorology is approximate at best. However, the chemical mass balance (CMB) receptor-oriented model is a comparatively simple "model" based on physical principles which can be used to determine the average contribution of specific source categories to particulate fallout. This model is based on the conservation of relative aerosol chemistry from the time a chemical species is emitted from its source to the time it is measured at a receptor. That is, if p sources are emitting Mj mass of particles, then m = M1 + M2 + ... + Mp, where m is the total mass of the particulate collected on a fallout tray at a receptor site. In other words, the model assumes the mass on the fallout tray is a linear combination of the mass contributed from each of the sources.

The mass of a specific chemical species, mi, is given by the following:

Equation 1

$$m_i = \sum_{j=1}^{p} M_{ij} = \sum_{j=1}^{p} F'_{ij} M_j$$

where Mij is the mass of chemical species i from source j and F'ij is the fraction of chemical species i in the mass from source j collected at the receptor. It is usually assumed that:

Equation 2

$$F'_{ij} = F_{ij}$$

where Fij is the fraction of chemical species i emitted by source j as measured at the source. The degree of validity in this assumption depends on the chemical and physical properties of the species and its potential for atmospheric modifications such as condensation, volatilization, chemical reactions, sedimentation, etc.

If we accept this equation, however, and divide both sides of Equation 1 by the total mass of the deposit collected at the receptor site, it follows that:

Equation 3

$$\frac{m_i}{m} = \sum_{j=1}^{p} F_{ij} \frac{M_j}{m}$$

or,

Equation 4

$$C_i = \sum_{j=1}^{p} F_{ij} S_j$$

where Ci = mi/m is the concentration of the chemical component i measured at the receptor and Sj = Mj/m is the source contribution, i.e., the ratio of the mass contributed from source j to the total mass collected at the receptor site. In practice, it is this fraction of particulate pollution measured at a receptor due to source j, Sj, which is of primary interest in receptor modeling calculations.

If the Ci and the Fij at the receptor for all p of the source types suspected of affecting the receptor are known, and p < n (n = number of chemical species), a set of n simultaneous equations exists from which the source type contributions Sj may be calculated by least squares methods.
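As a rough illustration of the paragraph above, the following sketch solves a small set of mass balance equations by ordinary (unweighted) least squares. The profile matrix `F`, the contributions `S_true`, and all numerical values are hypothetical and chosen only for illustration; NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical source profiles F[i, j]: mass fraction of species i in source j
# (n = 4 species, p = 2 sources). Values are illustrative, not measured data.
F = np.array([
    [0.30, 0.02],
    [0.05, 0.20],
    [0.10, 0.10],
    [0.01, 0.15],
])

# Hypothetical source contributions S_j (fractions of total receptor mass).
S_true = np.array([0.6, 0.3])

# Receptor concentrations from the mass balance C_i = sum_j F_ij * S_j.
C = F @ S_true

# Ordinary least squares solution of the n equations in p unknowns.
S_est, residuals, rank, _ = np.linalg.lstsq(F, C, rcond=None)
print(S_est)
```

Because the receptor data here are generated without noise, the fit should recover the assumed contributions. With real measurements the equations are only approximately satisfied, and a weighted fit is generally preferred.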

Application of the CMB Modeling Method

In a typical chemical mass balance application, EPA's Version 7.0 CMB model (EPA, 1990) is applied to selected ambient samples. The CMB receptor modeling is performed in a manner consistent with EPA's Protocol for Applying and Validating the CMB Model (EPA, 1987).

The CMB procedure begins with a set of linear equations which expresses the ambient concentrations of chemical species measured at an ambient receptor site as the sum of products of source compositions and source contributions. This set of equations is over-determined (more equations than unknowns) because the number of chemical species exceeds the number of contributing source types. The source contributions are the unknowns in these equations. However, a unique solution cannot be found for this set of equations because measurement uncertainty precludes determination of exact values for source and receptor data. When these uncertainties are estimated for both source and receptor measurements, additional physical constraints are applied which yield a most probable solution. This solution minimizes the difference between calculated and measured receptor concentrations by using an effective variance weighting scheme. The weighting has a physical significance in that it is derived from the measurement uncertainties of both source and receptor chemical species. (Species with higher relative concentration uncertainties carry less weight in the regression than species with lower relative uncertainties.) Although the CMB solution is identical to some statistical inference methods, it is not dependent on statistical principles. The basic model equations which represent the source receptor relationship, the effective variance weighting, and the error propagation are all based on physical principles.

The CMB provides a source contribution estimate (SCE) and associated standard error uncertainty (STD ERR) for each source category. The model produces these estimates by making an effective variance weighted least squares fit between the chemical composition of the ambient sample and the composition of the sources. It estimates what amounts of each source (the SCEs) will collectively best explain the chemical composition of the ambient sample.
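The effective variance weighted fit described above can be sketched as an iteratively reweighted least squares loop. This is a minimal illustration, not the EPA program itself: it assumes the commonly cited form of the effective variance, V_i = sigma_Ci^2 + sum_j (S_j * sigma_Fij)^2, and the function name, convergence tolerance, and all numerical values are hypothetical.

```python
import numpy as np

def effective_variance_fit(F, sigma_F, C, sigma_C, n_iter=20, tol=1e-8):
    """Sketch of an effective-variance weighted least squares fit.

    F, sigma_F : (n, p) source profiles and their uncertainties
    C, sigma_C : (n,) receptor concentrations and their uncertainties
    Returns (S, std_err): contribution estimates and their standard errors.
    """
    S = np.zeros(F.shape[1])  # first pass weights by receptor uncertainty only
    for _ in range(n_iter):
        # Assumed effective variance: receptor variance plus profile
        # variance scaled by the current contribution estimates.
        V = sigma_C**2 + (sigma_F**2) @ (S**2)
        A = F.T @ (F / V[:, None])        # F^T V^-1 F
        b = F.T @ (C / V)                 # F^T V^-1 C
        S_new = np.linalg.solve(A, b)
        converged = np.max(np.abs(S_new - S)) < tol
        S = S_new
        if converged:
            break
    # Standard errors from the inverse of the weighted normal matrix.
    V = sigma_C**2 + (sigma_F**2) @ (S**2)
    cov = np.linalg.inv(F.T @ (F / V[:, None]))
    return S, np.sqrt(np.diag(cov))

# Hypothetical profiles, contributions, and 10% / 5% relative uncertainties.
F = np.array([[0.30, 0.02], [0.05, 0.20], [0.10, 0.10], [0.01, 0.15]])
S_true = np.array([0.6, 0.3])
C = F @ S_true
S, std_err = effective_variance_fit(F, 0.10 * F, C, 0.05 * C)
print(S, std_err)
```

In this sketch the returned `S` plays the role of the SCE and `std_err` the role of the STD ERR. Species with larger effective variance receive smaller weights, which matches the weighting behavior described in the text.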

There are five basic data types necessary for CMB modeling:

The ability of the CMB model to achieve a proposed set of apportionment goals is determined before the data is input into the computer. In other words, the chemical composition of the source profiles and ambient aerosol are established before the model is applied. At the time of data input, the only options available are the selection of source profiles and the source category names to associate with the profiles.

There are four major steps involved in applying the CMB receptor model to an existing database:

The appropriateness of a data set for CMB modeling must be determined before the CMB model is applied. There are no quantitative rules that can be used. However, the EPA suggests using the following criteria as a guide (EPA, 1987):

Once it is determined that application of the CMB model is appropriate, it can be applied at varying levels of complexity. The EPA arbitrarily separates these into three levels. Level I uses existing data or data that can easily be obtained from analyses of existing samples. Level II involves additional analyses on existing samples or the acquisition of additional samples. Level III is a comprehensive CMB analysis and includes the acquisition of new data from both ambient and source sampling.

The process of CMB analysis consists of selecting the optimum solution to the effective variance least squares regression using the following seven steps:

Although there is a degree of subjectivity in this selection process, much of the subjectivity is removed if the fitting protocols and goodness-of-fit statistical criteria recommended by the EPA are used. The first step is to include all the sources or representatives of all source categories and all defined key species in the initial CMB analysis. Examination of the statistical goodness-of-fit criteria resulting from this initial analysis is used to evaluate the quality of the source contribution estimates. Based on this examination, a different set of sources and species is selected and evaluated. This stepwise procedure continues until, based on the following criteria, an optimum fit is obtained:

The model provides three primary outputs: the contribution estimates to ambient concentrations of the sources or source categories which are included in the fit (SCE), the standard errors of these source contribution estimates (STD ERR), and the species concentrations calculated from the fit (CALC).

The model provides three statistical measures which can be used to evaluate how well the model's calculated species concentrations match the ambient measurements for these species. These statistics are the percent of total mass explained by the fit (% MASS), R-SQUARE, and CHI-SQUARE. It is generally desirable to obtain a good fit of the data based on these three measures while obtaining SCEs with low STD ERR relative to the size of the SCE.
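The text does not give formulas for these three statistics, so the sketch below uses definitions that are commonly cited for CMB software and should be treated as assumptions: a reduced chi-square weighted by the effective variance, an R-square equal to one minus the weighted residual sum of squares over the weighted sum of squared measurements, and percent mass as the summed contributions over the total measured mass. The function name and the numerical values are hypothetical.

```python
import numpy as np

def fit_statistics(F, S, C, V_eff, total_mass=1.0):
    """Sketch of % MASS, R-SQUARE, and CHI-SQUARE under assumed definitions.

    F : (n, p) source profiles, S : (p,) contribution estimates,
    C : (n,) measured concentrations, V_eff : (n,) effective variances,
    total_mass : measured total mass in the same units as S
                 (1.0 when concentrations are normalized to total mass).
    """
    n, p = F.shape
    calc = F @ S                                  # calculated concentrations
    weighted_ss = np.sum((C - calc) ** 2 / V_eff) # weighted residual sum of squares
    chi_square = weighted_ss / (n - p)            # reduced chi-square
    r_square = 1.0 - weighted_ss / np.sum(C ** 2 / V_eff)
    pct_mass = 100.0 * np.sum(S) / total_mass
    return pct_mass, r_square, chi_square

# Hypothetical fit: measured values deviate a few percent from calculated ones.
F = np.array([[0.30, 0.02], [0.05, 0.20], [0.10, 0.10], [0.01, 0.15]])
S = np.array([0.6, 0.3])
C = (F @ S) * np.array([1.05, 0.95, 1.00, 1.02])
V_eff = (0.10 * C) ** 2
pct_mass, r_square, chi_square = fit_statistics(F, S, C, V_eff)
print(pct_mass, r_square, chi_square)
```

Under these assumed definitions, a good fit shows an R-square close to 1, a small chi-square, and a percent mass close to 100.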

The model provides four diagnostics to help identify data responsible for a poor fit so that improved data might be obtained or included to rectify the situation. These are the uncertainty/similarity clusters (U/S CLUSTERS), the ratio of calculated to measured species concentrations (RATIO C/M), the ratio of the residual (calculated minus measured) to the uncertainty of this difference (RATIO R/U), and the portion of a calculated species concentration that is attributed by the model to each source (SSCONT). The latter diagnostic is not included on the standard CMB printout.
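Three of these diagnostics can be sketched directly from their descriptions above. The sketch assumes that the uncertainty of the residual is the square root of the effective variance, V_i = sigma_Ci^2 + sum_j (S_j * sigma_Fij)^2, and that SSCONT is expressed as a fraction of the calculated concentration; the actual program may define these differently. The function name and data are hypothetical.

```python
import numpy as np

def species_diagnostics(F, sigma_F, S, C, sigma_C):
    """Sketch of RATIO C/M, RATIO R/U, and SSCONT for each species."""
    calc = F @ S
    ratio_cm = calc / C                            # calculated / measured
    # Assumed uncertainty of the residual: square root of effective variance.
    V_eff = sigma_C**2 + (sigma_F**2) @ (S**2)
    ratio_ru = (calc - C) / np.sqrt(V_eff)         # residual / uncertainty
    # Fraction of each calculated species concentration from each source.
    sscont = (F * S[None, :]) / calc[:, None]
    return ratio_cm, ratio_ru, sscont

# Hypothetical data: the third species is fitted exactly.
F = np.array([[0.30, 0.02], [0.05, 0.20], [0.10, 0.10], [0.01, 0.15]])
S = np.array([0.6, 0.3])
C = (F @ S) * np.array([1.05, 0.95, 1.00, 1.02])
ratio_cm, ratio_ru, sscont = species_diagnostics(F, 0.10 * F, S, C, 0.05 * C)
print(ratio_cm, ratio_ru, sscont, sep="\n")
```

A C/M ratio far from 1 or an R/U ratio of large magnitude flags a species that the current source list does not explain well.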

There are four main error categories that can impact model performance: incorrect ambient data, incorrect source profiles, incorrect source list, and profile uncertainty/incorrect collinearity. The existence of these errors can be inferred from the diagnostics and indicators listed above. Possible corrective actions include evaluating ambient and source data, reanalyzing samples, including different sources in the source list, deleting sources from the source list, compositing collinear source profiles, analyzing samples for additional species, etc. After corrective action has been taken, the fit of the measured species data is reevaluated.

When statistically sound and physically reasonable fits have been obtained for the ambient samples of interest, the stability of the CMB model results is assessed. This includes the evaluation of the sensitivity of the model's results to errors in the sources, source profiles, and the ambient data. The final step in the application of the CMB model is validation. In this step, the model results are evaluated for their consistency with available related data (e.g., meteorological, spatial, emissions, and particle size data). Comparisons are made with the results of other receptor and/or dispersion models, if available.
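One simple way to probe the sensitivity of the results to errors in the ambient data is to perturb the measured concentrations within their uncertainties and refit many times. The sketch below is one possible approach and not the EPA protocol: it perturbs only the ambient data, uses a plain weighted least squares fit, and all names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Hypothetical profiles, contributions, and 5% receptor uncertainties.
F = np.array([[0.30, 0.02], [0.05, 0.20], [0.10, 0.10], [0.01, 0.15]])
S_true = np.array([0.6, 0.3])
C = F @ S_true
sigma_C = 0.05 * C

def weighted_fit(F, C, sigma_C):
    """Weighted least squares: scale each equation by 1 / sigma_C."""
    w = 1.0 / sigma_C
    S, *_ = np.linalg.lstsq(F * w[:, None], C * w, rcond=None)
    return S

# Refit after adding Gaussian noise to the ambient concentrations.
trials = np.array([
    weighted_fit(F, C + rng.normal(0.0, sigma_C), sigma_C)
    for _ in range(500)
])
mean_S = trials.mean(axis=0)    # average contribution estimates
spread_S = trials.std(axis=0)   # spread indicates sensitivity to data errors
print(mean_S, spread_S)
```

A spread that is large relative to the contribution estimate suggests the fit for that source is unstable. The same idea could be extended to perturbing the source profiles.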

When the summary statistics and diagnostics are generally within target ranges, when there are no significant deviations from model assumptions, when the sensitivity tests uncover no unacceptable instability or consistency problems, and when the results are consistent with available related data, the CMB analysis is considered complete and valid.

Using the fitting parameters in Table 1 and the EPA guidelines, this modeling procedure will generally result in optimized source contributions. The resulting fit is only one of many possible solutions, but it should be the most probable solution. The existence of several different solutions with similar fitting parameters suggests similar probabilities of correctness for each set of source contributions. In such a case, the SCEs of the major sources will likely be quite similar.