Predictive Vegetation Distribution Model

We used binomial logistic regression to generate a predictive model of Araucaria forest distribution under natural conditions, with forest presence as the outcome variable and terrain parameters as predictors. The model was based on a 10 × 10 km control area south of Lages, where relatively undisturbed conditions were observed. Forest distribution was assessed from 1966 satellite imagery (CORONA) with a resolution of 4 m (available at https://earthexplorer.usgs.gov/). We manipulated the images in ArcGIS 10.2, where the Maximum Likelihood Classification tool was used to convert them into rasters of forest presence (1) or absence (0). The raster was converted into a grid of points to extract the values of the terrain predictor variables. Slope and aspect were generated from elevation data (SRTM with a resolution of 90 m available at https://earthexplorer.usgs.gov/) using the Spatial Analyst toolset in ArcGIS 10.2. Aspect was reclassified into south-facing (1) and facing any other direction (0). Topographic position index (TPI) is a measure of deviation of a terrain cell from the average elevation of its surroundings within a predetermined radius, with positive TPI indicating ridges and negative TPI indicating valleys. TPI was calculated from the SRTM raster using the Custom Land Facet Analysis toolset (available at http://www.jennessent.com/arcgis/land_facets.htm) with the recommended search radius of 5 cells. All terrain rasters were normalised so that the model could be generalised to areas with different elevation ranges from that of the control area. The logistic regression was run in R 3.3.3 and the coefficients were used in the Raster Calculator tool of ArcGIS 10.2 to generate a surface of forest probability. For the second model of Araucaria natural distribution, the TPI raster was reclassified using the recommended values of −6 for valleys and +6 for ridges (http://www.jennessent.com/arcgis/land_facets.htm). The result was added to the aspect raster. North-facing slopes and plateaus were predicted as grasslands (0) whereas south-facing slopes and valleys were coded as potential areas of forest (1) under natural conditions.

Test pits

Thirteen soil test pits were sampled in 5 cm increments, to a depth of 60 cm or bedrock. Samples from Mata Queimada and the Lages control were collected from transects across valleys, with soil profiles collected from the north-facing slope, the lower south-facing slope, the upper south-facing slope, and the plateau. Heraldo was also collected as transect, minus the forested north-facing slope due to the width of the Rio Caveiras. Single profiles were collected from Baggio and Luis Carlos on forested plateaus that are directly related to archaeological sites.

Isotopes and AMS Dating

Following protocol described by Pessenda et al.36,37, the soil organic matter from physical pre-treated bulk samples of each 5 cm increment were analysed for δ13C and Total Organic Carbon (TOC) at the Stable Isotope Laboratory of Centre for Nuclear Energy in Agriculture (CENA), University of São Paulo (Supplementary Table S4). Organic carbon is expressed as percentage of dry weight and 13C results are given as δ13C with respect to VPDB standard using the conventional δ (‰) notations:

$${\delta }^{13}{\rm{C}}(\textperthousand )=[({{\rm{R}}}_{{\rm{s}}{\rm{a}}{\rm{m}}{\rm{p}}{\rm{l}}{\rm{e}}}/{{\rm{R}}}_{{\rm{s}}{\rm{t}}{\rm{a}}{\rm{n}}{\rm{d}}{\rm{a}}{\rm{r}}{\rm{d}}}){\textstyle \,\text{-}\,}1]1000$$

where R sample and R standard are the 13C/12C ratios of the sample and standard, respectively. Analytical precision is ±0.2‰.

The 14C analyses on soil humin37 fraction were carried out by Accelerator Mass Spectrometry (AMS) by Beta Analytic. Radiocarbon ages are expressed as cal (2σ) BP (Before AD 1950) normalised to δ13C of −25‰ VPDB and as calibrate cal (2σ) BP38.

Radiocarbon dating of Soil Organic Matter (SOM) is complicated. Studies have shown that different components of SOM have different ages36,37,39. Comparison of 14C dating of humin and total soil fraction (three SOM fractions) from distinct soils, under different vegetation cover, in several regions of Brazil36,37,39, showed the insoluble humin fraction to be older, indicating contamination of total SOM by younger carbon in the fulvic and humic acids. The 14C ages of associated charcoal, in most cases, are in agreement with the humin fraction, or 20% older in average39. The test studies show that in the absence of charcoal, the humin fraction is a reliable material for 14C dating in soils. However, the humin fraction ages could be assumed as the minimum ages for carbon in soils.

The chronology for the LC core relies on three 14C dates in an age-depth model constructed in Bacon v2.2 within R. Bulk-sediment soil profile material was collected for conventional AMS radiocarbon dating. Radiocarbon ages were calibrated within Bacon using SHCal13 and modelled using Student-t test distributions with wide tails to negate the need of identifying and removing potential outliers in the age-depth model. Age-depth model mean accumulation rate priors in Bacon were calculated using the 14C chronology (acc.mean = 58 and memory priors were set slightly below default so that the model would capture accumulation rate changes driven by variable sediment delivery from the catchment (mem.strength = 2; mem.mean = 0.3). Model means and 2σ age distributions were calculated from Markov chain Monte Carlo age-depth iterations through the soil profile.

Radiocarbon Dates & SCPD

The Sum of the Calibrated Probability Distributions (SCPD or SPD) is a standard method for representing chronological trends in radiocarbon datasets. SCPDs are produced by calibrating each independent date in the sample and adding the results to produce a single density distribution. This has the advantage of including the full range of probabilities associated with calibrated dates, instead of using single point estimates40,41,42,43,44. SCPDs were built in OxCal using the Sum function and the ShCal13 calibration curve45,46. Dates clearly associated with southern Jê contexts were compiled from previously published syntheses, recently published papers, books, unpublished academic theses, and site reports (Supplementary Table S2), to which we added 42 new dates from our own field work (Supplementary Table S1). The complete dataset, before filtering, contains 245 dates from 118 archaeological sites. We applied measures of chronometric hygiene to gain more precision, overcome fluctuations in the calibration curve, and obtain a dataset that was as reliable as possible; excluding dates that: i) had a standard error equal to or larger than 100 years; ii) whose provenance was unclear; or iii) not assigned to a specific laboratory number. Additionally, to account for oversampling of some sites and phases within those sites, we applied a binning procedure41,42,43. Dates within sites were ordered and grouped into 100-year bins. Dates within each bin were merged with the R_combine function of OxCal. Timpson et al.43 found that different values for the bin-width did not affect the final shape of the SCPD. This procedure is necessary because a sum of the calibrated dates assumes that observations are independent, whereas this is not the case when multiple dates were obtained for single sites or phases within them. The final filtered dataset contained 134 dates. Despite the decrease in sample size, the filtered SCPD is highly correlated with an SCPD built with all available radiocarbon dates (Pearson’s r2 = 0.97). In addition to the SCPD, we plotted the frequencies of sites by 100 year bins using the median of the calibrated dates, each site being counted only once per time bin.

Data and materials availability

All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.