The methodology triangulates transactional and operational business data to estimate economic values, frequently where government statistics are not available. It can estimate the sales and employment in the green economy, the share of the country’s economy taken up by the green economy, growth in the green economy and the green economy sectors that are leading that growth. These estimates indicate the contribution of the green economy to the country’s economy, the progress made and national priority areas. The methodology, developed by kMatrix Ltd, uses a number of different data sources and data types (transactional, procurement, insurance, industrial benchmarking) to arrive at estimates of economic value that would not be possible from a single data source. Each data point requires at least seven data sources for ‘triangulation’, but in the Low Carbon and Environmental Goods and Services Sector (LCEGSS) dataset, the average number of data sources for each observation is 56. The transactional triangulation methodology has been used to: estimate climate change adaptation within ten megacities (Georgeson et al., 2016b), provide data on global private sector investment in clean energy R&D (Georgeson et al., 2016a), analyse global provision of climate and weather information (Georgeson et al., 2017a), and estimate global climate change adaptation spending relating to health (Watts et al., 2017). It has also been assigned official statistics status in order to provide trade statistics to the UK Government’s Defence and Security Organisation (Department for International Trade Defence and Security Organisation, 2015).

The transactional triangulation methodology also measures supply chain activity. Transactional data has advantages for measuring full economic impact, but it is not directly comparable to national statistics. A ‘core versus supply chain’ analysis has been conducted on the approximately 3800 activities in the LCEGSS dataset; the ratio is 45% core to 55% supply chain. Data collection therefore includes both activities by companies that are specialists in LCEGSS and non-specialist companies that operate within the value chain.

Definition of the low carbon and environmental goods and services sector

LCEGSS uses a wide range of different data types and sources, and a sectoral definition that is both ‘top-down’ and ‘bottom-up’. It is a pragmatic estimate of the green economy that collects and measures data only where sufficient evidence is available to support inclusion. The definition was originally developed in conjunction with early efforts by the UK government to define ‘Environmental Technologies’ in 2007, in response to the limitations of the UK industry classification system in accurately estimating the economic value of environmental protection within the UK economy. During development, the dataset was tested against ‘known’ sectors with existing SIC values to check its consistency with SIC-derived values. The ‘Environmental Technologies’ dataset aimed to identify environmental technologies across a range of sectors, which were primarily related to environmental protection, with less emphasis on low carbon activities. The development of the dataset in subsequent revisions led to a sectoral definition that covered environmental protection, renewable energy and low carbon activities. This revised definition, renamed the Low Carbon and Environmental Goods and Services (LCEGS) dataset, was used for UK national reporting by the UK Department for Business, Innovation and Skills (BIS) between 2008/09 and 2012/13. Subsequent revisions of the definition analysed new sources of data and new economic activities. The revised dataset has been renamed the Low Carbon and Environmental Goods and Services Sector (LCEGSS) to reflect efforts to better align the environmental protection, renewable energy and resources management sections of the definition with Eurostat’s EGSS. It has been used for research purposes in partnership with the C40, amongst others.

LCEGSS contains 26 sub-sectors (described as ‘Level 2’ in the definition’s taxonomy), which are grouped into three broad categories: Environmental, Low Carbon and Renewable Energy (see Table 1, and a more detailed version in the Supplementary Materials). The Environmental and Renewable Energy sectors largely represent distinct sectors within the broader economy, whereas the Low Carbon sector contains a number of economic activities that exist in a range of traditional industries.

Table 1 LCEGSS classification (Levels 1 and 2)

The LCEGSS definition covers 3800 discrete goods and services (described as ‘Level 5’ in the taxonomy), which are derived from sector supply chain activities (such as componentry and assemblies) and value chain activities (such as R&D, supply and training). The revisions to the LCEGSS definition added 953 activities, both through economic activities that have been identified and added to the definition, and through the identification of additional data sources that allowed the inclusion of economic activities in data collection. Seven hundred and seventy-eight of these relate to Energy from Waste, 49 to Biodiversity, 40 to Environmental Consultancy, 25 to Water and Waste Water Treatment, and 61 are split across a further eight subsectors. Other major revisions include dividing offshore and onshore wind to reflect their differing supply chain activities. To illustrate how the taxonomy functions, an example of the data taxonomy and values for Air Pollution Control for the US for 2015/16 is available in the Supplementary Materials.

The development process for the LCEGSS definition has reflected the lack of an internationally agreed definition of the ‘green economy’ or related sectors. In defining a new sector for measurement, decisions are required where no agreed boundaries for inclusion exist. However, internal quality assurance processes ensure internal consistency in the definition, and the methodology has been externally peer-reviewed or audited on a number of occasions, most recently in January 2017. In defining the boundaries of LCEGSS, decisions had to be made on the inclusion and classification of particular activities. For example, the definition of geothermal energy increasingly refers to both ‘deep vertical’ and ‘shallow horizontal’ heat sources. The highest growth in the sector is generated in horizontal applications at a one to two-metre depth, principally for private dwellings, which contributes to the size of the geothermal energy subsector in LCEGSS. At the city level, for example, shallow geothermal applications account for between 93 and 100% of the Geothermal subsector value in the LCEGSS dataset. By comparison, other ‘green economy’ categorisations measure certain shallow geothermal applications under ‘Renewable Heat’ or ‘Construction’ (alongside HVAC).

In the ‘Low Carbon’ sector, the LCEGSS definition includes industries where low carbon measurement is practical and some consensus exists around what should be included. These include low carbon activities within industries that account for high levels of carbon emissions, such as Building Technologies and Energy Management from Construction, and Electric Vehicles from Transport. Other industries are included because of their significance in responding to climate change, such as Carbon Finance from Finance and Insurance and Environmental Consulting from Professional Services. The historical development of LCEGSS in the UK meant that some subsectors, such as Carbon Finance and Nuclear Power, were included due to preference and policy relevance in the national context. LCEGSS does not currently measure low carbon activities from all industrial sectors. Current and future research is developing a more comprehensive method for classifying and identifying green and low carbon activities across a wider range of traditional industries.

Compiling and classifying economic activities and data

The process of compiling the LCEGSS definition for measurement was iterative and both ‘bottom-up’ and ‘top-down’. The first stage was to search for data sources for activities that fit the ‘ideal’ definition of LCEGSS. Then, based on the robustness of available evidence, the decision was taken to include or omit aspects of the ‘ideal’ definition. The resulting definition is therefore pragmatic and only includes economic activities for which multiple sources could be identified.

The deployment of this methodology enables the reporting of comparable estimates of LCEGSS activities from multiple countries. LCEGSS can be applied across multiple geographies (both between nations and at a subnational level) by triangulating international and national data sources. In addition, using a wide range of data types affords a better understanding of each activity. This has benefits for the identification of economic activities, the ‘in or out’ definitional decisions and the classification within the LCEGSS taxonomy. For example, the use of procurement data can assist in better identifying the ‘purpose’ of a product or service.

Economic activities were only measured where there was a ‘footprint’ of economic activity, not economic potential. Green economy sectors with a high potential for future growth but no currently measurable sales activity cannot be measured. Therefore, some subsectors with significant potential, like Wave & Tidal, are measured based upon their current market presence. This approach may understate their future value but does not inflate LCEGSS values based on early stage investments that may not succeed at scale.

With definitions established, rule sets and decision trees were written to filter source data to be included in LCEGSS measurement. Rule sets and filters determine what proportion of an economic activity is included in the sector analysis. In some cases, this is straightforward; activities that are clearly and directly associated with renewable energy sources like Wind, Geothermal, and Wave and Tidal are automatically included in the relevant LCEGSS sub-sector. Some accompanying economic activities are more difficult to allocate, such as the engineering support services that are part of the wind energy supply chain but that may be located within data sources relating to the general engineering sector. Filters determine the environmental characteristics of different products, components or materials, such as those that can be identified to save energy, reduce heat loss, use less raw materials, produce less waste, or assist companies to meet environmental standards. Filters also assess the end-use of more generic products or services to determine their inclusion in LCEGSS. For example, filters would be used to assess whether the economic value of road maintenance is due to routine wear and tear from traffic volumes (planned) or in response to new weather, climate, or environmental conditions (unplanned and additional).
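The idea that a filter fixes what proportion of an activity's value counts towards the sector can be sketched as follows. This is a minimal illustration only: the `Activity` class, the `included_share` field and all figures are hypothetical and are not taken from the LCEGSS rule sets.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    name: str
    sales: float           # total sales value of the activity (any currency unit)
    included_share: float  # hypothetical fraction (0 to 1) attributed to LCEGSS by a filter


def lcegss_value(activity: Activity) -> float:
    """Return the part of an activity's sales that counts towards LCEGSS."""
    if not 0.0 <= activity.included_share <= 1.0:
        raise ValueError("included_share must be between 0 and 1")
    return activity.sales * activity.included_share


# Hypothetical figures: a directly associated activity is included in full,
# while only the 'unplanned and additional' part of road maintenance is counted.
wind_turbines = Activity("Wind turbine manufacture", sales=100.0, included_share=1.0)
road_maintenance = Activity("Road maintenance", sales=200.0, included_share=0.25)

for activity in (wind_turbines, road_maintenance):
    print(activity.name, lcegss_value(activity))
```

In this sketch the share is supplied directly; in the methodology described above it would be the outcome of the rule sets and decision trees.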

Multiple filtering processes are required in differing combinations across the LCEGSS classification to filter relevant activities. The transactional triangulation methodology assesses ‘how’ or ‘why’ an activity is carried out and ‘where’ it is used, whereas industry classification systems classify based on ‘what’ an activity is. In the methodology, the interrogation of additional data (both additional sources and a variety of data types) permits this improved assessment and classification. For example, the use of procurement data can improve assessment and classification of the end purpose of a product or service. In an indicative, simplified set of filters relating to climate change, if an activity fulfils one of a set of purposes or needs that can be identified as strictly related to mitigation, then it is included for further filtering within LCEGSS. This methodology has previously been used to estimate spending on ‘Adaptation & Resilience to Climate Change’ (Georgeson et al., 2016b); therefore, if an activity can be identified as strictly related to adaptation, then it is included for further filtering in that dataset.

The process also uses technology filters for LCEGSS that assess whether a particular technology or process can be identified as relevant for inclusion in LCEGSS. Data to inform progress through the filtering process are drawn from market, technology, supply chain and procurement sources, although technology filters are not relevant to all LCEGSS activities. The technology filters include decision gates like:

Is the new technology or process an immediate and beneficial replacement in reducing resources, reducing emissions or reducing energy consumption?

Does the new technology or process provide robust short or medium term environmental benefits?

Does the new technology or process provide a solution to new requirements in law or regulation?
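The decision gates listed above could be represented as simple predicates, as in the sketch below. The field names, the gate names and the rule for combining gates are all assumptions made for illustration; the text does not state whether a technology must pass one gate or all of them, so the combination rule is left as a parameter.

```python
from typing import Callable, Dict

Gate = Callable[[dict], bool]

# Hypothetical encoding of the three example gates; the dictionary keys are invented.
GATES: Dict[str, Gate] = {
    "beneficial_replacement": lambda t: bool(
        t.get("reduces_resources", False)
        or t.get("reduces_emissions", False)
        or t.get("reduces_energy", False)
    ),
    "robust_benefit": lambda t: bool(t.get("robust_short_or_medium_term_benefit", False)),
    "regulatory_solution": lambda t: bool(t.get("meets_new_regulation", False)),
}


def evaluate_gates(technology: dict) -> Dict[str, bool]:
    """Answer each gate for one technology or process."""
    return {name: gate(technology) for name, gate in GATES.items()}


def passes_technology_filter(technology: dict, require_all: bool = False) -> bool:
    """Combine gate answers; the combination rule is an assumption, not the source's."""
    results = evaluate_gates(technology)
    return all(results.values()) if require_all else any(results.values())


heat_pump = {"reduces_energy": True}  # hypothetical record
print(evaluate_gates(heat_pump))
print(passes_technology_filter(heat_pump))
```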

Filters also provide sufficient confirmatory evidence of end purpose (generally through procurement-related data) to determine whether a product or service has been used for an environmental purpose. More detailed disaggregation of product or service data is frequently required to ensure that only LCEGSS-related activities are included and to prevent over-reporting due to the inclusion of non-environmental activity value. This requires interrogating data values at a level of disaggregation greater than ‘Level 5’ in the taxonomy (which is equivalent to a product, service or economic activity) to filter out non-LCEGSS-related value from economic activities. This disaggregation is beneficial to the measurement of economic sectors with hard-to-define boundaries (through overlap with other sectors) or hard-to-define content (activities that may include both relevant and invalid purposes). This level of data mining was also necessary for measuring climate change adaptation and weather and climate information services using the transactional triangulation methodology (Georgeson et al., 2017a, 2016b).

Data acquisition

The data acquisition methodology is based on a system originally developed at Harvard Business School for triangulating transactional and operational business data to estimate economic values in areas where government statistics and standard industry classifications are not available (Jaikumar, 1986). This system, referred to as ‘profiling’, takes approaches from business intelligence and related fields to track technological and industrial change. It has been established within business intelligence literature (and related fields) that there are significant volumes of information for compilation and aggregation to analyse markets and industries, but this information is often dispersed and unsorted (Zanasi, 1998). Attempts to define these approaches have often taken process-based or demand-driven frameworks as the historical development of these processes was as an input into corporate decision-making (Baars and Kemper, 2008; Jourdan et al., 2008; Lackman et al., 2000; Pirttimäki, 2007).

The filters, rules and decision trees used to select relevant LCEGSS activities are central to data acquisition; the accuracy of the rules and the availability of sufficient robust and reliable data are the basis for estimating economic values. A five-step process is outlined in Fig. 1; the five broad stages are the framework for the specific steps of the methodology detailed below. The data triangulation process takes large quantities of unstructured, singular and fragmented data to construct the dataset of LCEGSS value estimates.

Fig. 1 The data acquisition process for the transactional triangulation methodology

Figure 1 suggests a linear process; however, there is a degree of iteration to data acquisition. The definitions and data collected must be tested and validated during the process. Unlike a SIC-based approach, the transactional triangulation process involves the definition of the activities to be measured. An iterative process, which allows for feedback and adjustments, is therefore necessary. Through these methods, the transactional triangulation methodology is capable of tracking changing and emerging industries; the LCEGSS definition has been revised and extended more than once since data collection began in 2006/7. This is important for measuring the green economy; by comparison, the process of publishing the Eurostat EGSS definition took over 10 years and limitations to the classification of ‘Resource Management’ remain.

The data triangulation methodology and the underlying data used to produce LCEGSS data have some characteristics that are typical of ‘big data’ approaches (Gandomi and Haider, 2015): high volume, high velocity and high variety. It uses a significantly higher number of sources than other approaches, processes data more quickly than survey-based approaches, and handles data of a number of different types from a variety of sources. It is not, however, directly comparable with values derived from estimations produced by national statistics agencies.

For each transaction listed in the LCEGSS dataset, a minimum of seven separate sources must independently record the transaction for it to be confirmed and included in our database. Across the entire LCEGSS database, the average number of sources for each data point is 56. At the country/territory level, the average number of sources for each transaction ranges from 52 (Faroe Islands) to 215 (Australia). These databases have been tracked in a Data Management system and their continued relevance and utility have been verified over a number of years. Sources are screened to remove duplicate references to a single source and then shortlisted by removing outliers and unreliable sources. This shortlist is then screened again to stress test inconsistent values and remove them if necessary. From the remaining sources, a value is estimated. These estimates are ‘reality tested’ by comparing activity values within and across economic or industrial sectors or, where available, with recognised industry benchmarks and government statistics.
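The screening sequence described above (de-duplicate, remove outliers, check the minimum number of sources, estimate a value) can be sketched as below. Several details are assumptions: the outlier rule (a maximum relative deviation from the median), the `max_deviation` threshold, and the choice to apply the seven-source minimum to the shortlist rather than to the raw sources are not specified in the text. The mean is used as the estimate, in line with the description of final data points elsewhere in this section.

```python
from statistics import mean, median
from typing import Iterable, Optional, Tuple

MIN_SOURCES = 7  # minimum number of independent sources per data point


def triangulate(observations: Iterable[Tuple[str, float]],
                max_deviation: float = 0.25) -> Optional[float]:
    """Estimate one value from (source_id, value) pairs, or return None if unconfirmed."""
    # 1. Remove duplicate references to a single source (keep the first value seen).
    unique = {}
    for source_id, value in observations:
        unique.setdefault(source_id, value)
    values = list(unique.values())
    if not values:
        return None

    # 2. Shortlist by dropping outliers (assumed rule: relative deviation from the median).
    centre = median(values)
    shortlist = [v for v in values if abs(v - centre) <= max_deviation * abs(centre)]

    # 3. Require the minimum number of sources (assumed to apply to the shortlist).
    if len(shortlist) < MIN_SOURCES:
        return None

    # 4. Estimate the value as the mean of the remaining sources.
    return mean(shortlist)


# Hypothetical source values for one transaction; "a" is referenced twice, "h" is an outlier.
example = [("a", 10.0), ("a", 10.0), ("b", 9.0), ("c", 11.0), ("d", 10.0),
           ("e", 10.5), ("f", 9.5), ("g", 10.0), ("h", 50.0)]
print(triangulate(example))
```

The ‘reality testing’ against benchmarks and government statistics is a separate, later check and is not modelled here.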

Much of this data is already in the public domain, although it requires the corroboration of multiple sources and triangulation between different sources (financial, legal, academic, industry, trade association, procurement, government) before it can be validated and transformed into more usable data. The triangulation process and use of proxy data demonstrate two key characteristics of ‘big data’ research methodologies (high volume and high variety). The methodology can:

a. select from multiple sources of pre-existing data (mature sectors),

b. select from more limited sources of pre-existing data and combine this with triangulated data to achieve more robust results,

c. find no pre-existing sources and use triangulated data to create the sources necessary for analysis (emerging sectors).

As an example, for one historical data point for services relating to corporate governance for climate change, the consulting sector data reported that in 2010/11, 250 major corporates commissioned work (the consulting sector data frequently does not report values for commercial reasons). Investor relations and fund management sector data reported that overall £8.75 m was spent on this work, and trade association data reported independently that some £9.2 m had been spent. Triangulating these figures with additional sources is the basis for deriving more accurate estimates of the value of this economic activity. A more detailed example of the value estimation process is available in the Supplementary Materials.

Data sources

Given the range of industries and sectors covered, many data sources are required. These include local, national and international sources that have been commissioned, as well as relevant published data and research. Where other green economy studies may have used a single one of these sources (Yi and Liu, 2015), for LCEGSS the sources include:

a wide range of industry/trade associations (from major national and international industry associations to federations and trade bodies for specialised sectors and manufacturers, including the Solar Trade Association),

financial institutions (such as Standard and Poor’s, national and international banks),

company data (such as Dun and Bradstreet, FAME),

market research organisations and journals (such as Bloomberg, Data Monitor, Frost and Sullivan),

professional services and organisations (including Institutes, Chartered Bodies, Societies, such as the Chartered Institute of Water and Environmental Management),

government agencies (such as national statistics agencies),

academic sources (such as Harvard Business School, MIT Bench Series),

and industrial benchmark information (such as Data Monitor, DTM Corporation, kMatrix’s in-house benches).

A total of 1589 data sources are used across the LCEGSS dataset. The process uses general and specific sources, but it is weighted towards sector-specific sources. The number of sources used to compile a single data point (the estimated Sales value) for each of the 3800 lines of economic activities in the LCEGSS definition for each city or country is calculated and collated. The triangulation of data from multiple sources contributes to reducing the impact of biases inherent in certain sources of data. To minimise these biases further, all sources are tracked and managed for accuracy and reliability over time. New sources of data become available regularly, but these are then subject to an ‘incubation’ period within the data management system. This establishes the frequency (of relevance) and credibility of the source before it is included in any analysis.

Monitoring new and existing sources enables the quality of data sources to be improved over time; new sources are monitored for inclusion and older sources are removed if their reliability deteriorates. For each source, a historical log records source name, source value, year it relates to, the number of times used, ‘hit rate’ (confidence or reliability) and whether it will be accepted for a specific research purpose. The source management datasets relate to each calculated value within any sector data. These data sources are monitored closely internally and are routinely spot-checked each year, and reviewed by data users as part of any peer-review or audit. There is a separate data management system for each sector in the data collection, as sources can be relevant to multiple sectors. Once added into the data source management systems, data sources are tracked and assessed for each individual data collection purpose.
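The historical log kept for each source can be pictured as a simple record, as in the sketch below. The field names follow the items listed above, but the class, the `min_hit_rate` threshold and the example entries are hypothetical and are not taken from the actual data management system.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SourceLogEntry:
    source_name: str
    source_value: float
    year: str          # period the value relates to, e.g. "2010/11"
    times_used: int
    hit_rate: float    # confidence or reliability, expressed here as 0 to 1
    accepted: bool     # accepted for the specific research purpose


def usable_sources(log: List[SourceLogEntry],
                   min_hit_rate: float = 0.8) -> List[SourceLogEntry]:
    """Keep accepted entries whose hit rate meets a (hypothetical) threshold."""
    return [entry for entry in log if entry.accepted and entry.hit_rate >= min_hit_rate]


# Hypothetical log entries for one data point.
log = [
    SourceLogEntry("Trade body A", 9.2, "2010/11", 12, 0.9, True),
    SourceLogEntry("Report B", 30.0, "2010/11", 1, 0.4, True),
    SourceLogEntry("Directory C", 8.8, "2010/11", 5, 0.95, False),
]
print([entry.source_name for entry in usable_sources(log)])
```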

For LCEGSS, revenue data are produced to an average ‘confidence range’ of 85%, and employment data are produced to an average confidence range of 83%. Confidence ranges are a function of the range of source values assembled for each data point. Each final data point is the mean of the final range of values (after outliers are removed). The confidence range is derived from the difference between the mean value and the most extreme values in the range: an 85% confidence range means that the difference between the mean and the extreme values is 15%. Data estimates were returned for 226 countries and territories.
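Read literally, the confidence range can be computed from the final set of source values as in the sketch below. This is one interpretation, assuming that the 15% in the example is the largest deviation from the mean expressed as a fraction of the mean; the function name and the example values are hypothetical.

```python
from statistics import mean
from typing import Sequence, Tuple


def confidence_range(values: Sequence[float]) -> Tuple[float, float]:
    """Return (data point, confidence range) for a final range of source values.

    The data point is the mean of the values. The confidence range is taken to be
    1 minus the largest relative deviation of any value from that mean (an assumption).
    """
    point = mean(values)
    if point <= 0:
        raise ValueError("the mean must be positive for a relative deviation")
    largest_deviation = max(abs(v - point) for v in values)
    return point, 1.0 - largest_deviation / point


# Hypothetical final range after outlier removal: the extremes sit 15% from the mean.
point, confidence = confidence_range([85.0, 100.0, 115.0])
print(point, round(confidence, 2))
```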

Employment

Employment values in LCEGSS are a measure of the estimated employment numbers across all aspects of the supply chain. National, regional, city and other economic data sources were used to estimate current employment levels for each sector activity. Where employment information is scarce, or where employment is estimated as a proportion of a company’s sales, a comprehensive range of case study materials is assessed to provide industry-specific ratios and benchmarks. The employment figures for LCEGSS can be used to analyse the labour intensity of economic activities across sectors.

Sales per FTE

Productivity is frequently defined as a ratio of a volume measure of output to a volume measure of input (Organisation for Economic Co-operation and Development, 2001). There are many different measures of productivity, but from the measures available in the LCEGSS dataset, we were principally able to produce a proxy measure of labour productivity based on gross output. It provides an estimate of how efficiently labour is combined with other factors of production. As a proxy measure of productivity, it has a number of limitations and should only be regarded as a partial measure of productivity that reflects the joint influence of a number of factors, and it should not be interpreted as the productivity of individuals in the labour force (Organisation for Economic Co-operation and Development, 2001). Although it is frequently reported as output/hour, the UK Office for National Statistics notes that labour efficiency can be measured as output/hour, output/worker and output/job (Office for National Statistics, 2017). Given the data available, output in sales revenue ($m) per job (full-time equivalent) is the most appropriate method.
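As a minimal sketch of this proxy measure, sales per FTE is simply sales revenue divided by the number of full-time equivalent jobs. The function name and the figures below are hypothetical and are not values from the LCEGSS dataset.

```python
def sales_per_fte(sales_musd: float, fte_jobs: float) -> float:
    """Proxy labour productivity: sales revenue ($m) per full-time equivalent job."""
    if fte_jobs <= 0:
        raise ValueError("fte_jobs must be positive")
    return sales_musd / fte_jobs


# Hypothetical subsector: $500m of sales supported by 4000 FTE jobs.
ratio = sales_per_fte(500.0, 4000.0)
print(ratio)  # $m of sales per FTE job
```

Because the numerator is gross sales rather than value added, the ratio is sensitive to the amount of bought-in inputs in each activity, which is one reason it is treated only as a partial measure.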

Limitations

The transactional triangulation methodology is different to national statistics, but methods have been developed over time to enable it to be more comparable to traditional data sources. Constructing a definition for measurement of a new sector is complicated by differences between countries in how products and services are described and how these are assigned to industry codes. Therefore, the compilation of transactional data has to overcome variations in how the same activities are recorded in different countries and sectors. The data definition process has to identify how different descriptions vary, group together those that describe the same activities, and then create or adopt a universally applicable description to aid global data collection and reporting. Therefore the ‘language’ of LCEGSS does not map directly to any national industry descriptors, but it has wide relevance and is based on the descriptions used in industry where possible, especially in the case of more ‘mature’ sectors where an agreed language for definitions has been established.

Data collection using this methodology means that a sector definition will only include product and service activities that have a traceable economic footprint in the form of a trading history. Publicly funded or academic research and any technologies that have not yet reached the market are not included in the sector definition. This is influenced by the nature of the industry and market-focused sources accessed in the data collection process.

LCEGSS measures economic activities across existing industries and does not just measure environmental protection activities, but it does not currently measure the full extent of the ‘green economy’ in all existing industrial sectors. As noted, this is partially a consequence of the lack of consensus on how to classify varying categories of low carbon, environmental, green and sustainable economic activities that exist within individual industries. Future research aims to construct such a classification to develop a full ‘green economy’ model for data collection.

The methodology used means that LCEGSS is not an exact fit with any existing classification systems, nor particular national measurement frameworks. However, while this is a limitation in some ways (especially from the perspective of national accounting), there are advantages from a research perspective; comparison between sectors and countries is possible without the significant time or resource requirements of rewriting the national classifications or accounting systems. Data collection for LCEGSS could be described as an ‘overlay’ system that can operate above national industry classification systems to better report and analyse the green economy in the short term, without the reclassification of industrial codes required to achieve a measurable definition using industry classifications. Moreover, by using global data sources, some of the limitations of the reporting systems for smaller countries can be overcome by accessing external data, and the use of internal and external data sources permits the measurement of trade flows between countries.

Calculating comparison values

GDP (nominal) data (2015 estimates) were taken from the April 2016 update of the International Monetary Fund’s World Economic Outlook. Comparisons would be different using data adjusted for purchasing power parity.

While data for many countries are available in the LCEGSS dataset, given the lack of data availability in the US and reduced discussion of the definition of the green economy in the wake of the end of the GGS survey, country data for the US was deemed to be an important focus of the study. More recently, given the revival in contemporary political debates of the concept of a ‘Green New Deal’, more up-to-date and comprehensive analysis of the green economy through the LCEGSS data could be an important contribution. Although the data was originally developed in the UK, it was decided to compare the US to China, as the other nation with a similar size of LCEGSS sales estimates, as well as the G20 and the OECD, as other important international groups of industrialised or market-orientated economies that also include the major European nations.

As the US and China are analysed and presented separately from these country groupings, the G20 comparison refers to the 19 member states of the G20, minus the US and excluding the European Union and observer country Spain. Similarly, the OECD comparison includes all states that are members of the OECD, excluding the US and China. Population data (2015 estimates) were also taken from the April 2016 update of the International Monetary Fund’s World Economic Outlook. Estimates of working age population were taken from the 2015 revision of World Population Prospects, published by the Population Division of the UN Department of Economic and Social Affairs.