Photo-astrometric distances, extinctions, and astrophysical parameters for Gaia DR2 stars brighter than G = 18

F. Anders1,2,3, A. Khalatyan2, C. Chiappini2,3, A. B. Queiroz2,3, B. X. Santiago4,3, C. Jordi1, L. Girardi5, A. G. A. Brown6, G. Matijevič2, G. Monari2, T. Cantat-Gaudin1, M. Weiler1, S. Khan7, A. Miglio7, I. Carrillo2, M. Romero-Gómez1, I. Minchev2, R. S. de Jong2, T. Antoja1, P. Ramos1, M. Steinmetz2 and H. Enke2

A&A 628, A94 (2019)

1 Institut de Ciències del Cosmos, Universitat de Barcelona (IEEC-UB), Carrer Martí i Franquès 1, 08028 Barcelona, Spain

e-mail: fanders@icc.ub.edu

2 Leibniz-Institut für Astrophysik Potsdam (AIP), An der Sternwarte 16, 14482 Potsdam, Germany

3 Laboratório Interinstitucional de e-Astronomia – LIneA, Rua Gal. José Cristino 77, Rio de Janeiro 20921-400, Brazil

4 Instituto de Física, Universidade Federal do Rio Grande do Sul, Caixa Postal 15051, Porto Alegre 91501-970, Brazil

5 Osservatorio Astronomico di Padova, INAF, Vicolo dell’Osservatorio 5, 35122 Padova, Italy

6 Leiden Observatory, PO Box 9513 2300 RA Leiden, The Netherlands

7 School of Physics and Astronomy, University of Birmingham, Edgbaston, B15 2TT Birmingham, UK

Received: 24 April 2019

Accepted: 27 June 2019

Abstract

Combining the precise parallaxes and optical photometry delivered by Gaia’s second data release with the photometric catalogues of Pan-STARRS1, 2MASS, and AllWISE, we derived Bayesian stellar parameters, distances, and extinctions for 265 million of the 285 million objects brighter than G = 18. Because of the wide wavelength range used, our results substantially improve the accuracy and precision of previous extinction and effective temperature estimates. After cleaning our results for both unreliable input and output data, we retain 137 million stars, for which we achieve a median precision of 5% in distance, 0.20 mag in V-band extinction, and 245 K in effective temperature for G ≤ 14, degrading towards fainter magnitudes (12%, 0.20 mag, and 245 K at G = 16; 16%, 0.23 mag, and 260 K at G = 17, respectively). We find a very good agreement with the asteroseismic surface gravities and distances of 7000 stars in the Kepler, K2-C3, and K2-C6 fields, with stellar parameters from the APOGEE survey, and with distances to star clusters. Our results are available through the ADQL query interface of the Gaia mirror at the Leibniz-Institut für Astrophysik Potsdam (gaia.aip.de) and as binary tables at data.aip.de. As a first application, we provide distance- and extinction-corrected colour-magnitude diagrams, extinction maps as a function of distance, and extensive density maps. These demonstrate the potential of our value-added dataset for mapping the three-dimensional structure of our Galaxy. In particular, we see a clear manifestation of the Galactic bar in the stellar density distributions, an observation that can almost be considered direct imaging of the Galactic bar.

Key words: stars: fundamental parameters / stars: distances / stars: statistics / dust / extinction / Galaxy: stellar content / Galaxy: structure

© ESO 2019

1. Introduction

Galactic astrophysics is currently in a similar phase as geography was in the 15th century: large parts of the Earth were unknown to contemporary scientists, only crude maps of most of the known parts of the Earth existed, and even the orbit of our planet was still under debate. Nowadays, major parts of the Milky Way are still hidden by thick layers of dust, but we are beginning to discover and to map our Galaxy in a much more accurate fashion by virtue of dedicated large photometric, astrometric, and spectroscopic surveys.

In this context, the astrometric European Space Agency mission Gaia (Gaia Collaboration 2016) represents a major leap in our understanding of the Milky Way’s stellar content: its measurement precision as well as the absolute number counts surpass previous astrometric datasets by several orders of magnitude. The recent Gaia Data Release 2 (Gaia DR2; Gaia Collaboration 2018b) covered the first 22 months of observations (from a currently predicted total of approximately ten years) with positions and photometry for 1.7 × 10⁹ sources (Evans et al. 2018), proper motions and parallaxes for 1.3 × 10⁹ sources (Lindegren et al. 2018), astrophysical parameters for ≃10⁸ stars (Andrae et al. 2018), and radial velocities for 7 × 10⁶ of them (Sartoretti et al. 2018; Katz et al. 2019).

The Gaia DR2 dataset thus represents a treasure trove for many branches of Galactic astrophysics. Various advances have since been achieved in the field of Galactic dynamics (e.g. Gaia Collaboration 2018c,d; Antoja et al. 2018; Kawata et al. 2018; Quillen et al. 2018; Ramos et al. 2018; Laporte et al. 2019; Monari et al. 2019; Trick et al. 2019), star clusters and associations (e.g. Gaia Collaboration 2018a; Cantat-Gaudin et al. 2018a,b,c; Castro-Ginard et al. 2018; Soubiran et al. 2018; Zari et al. 2018; Baumgardt et al. 2019; Bossini et al. 2019; de Boer et al. 2019; Meingast & Alves 2019), the Galactic star-formation history (Helmi et al. 2018; Mor et al. 2019), hyper-velocity stars (e.g. Bromley et al. 2018; Scholz 2018; Shen et al. 2018; Boubert et al. 2018, 2019; Erkal et al. 2019), among others. Apart from stellar science, the precise Gaia DR2 photometry, in combination with the high quality of the stellar parallax measurements, can also be used to map the distribution of dust in the Galaxy. The availability of precise individual distance and extinction determinations (mainly from high-resolution spectroscopic surveys, and also recently from Gaia) has led to a significant improvement of interstellar dust maps within the past years and months (e.g. Lallement et al. 2014, 2018, 2019; Green et al. 2015; Capitanio et al. 2017; Rezaei Kh. et al. 2017, 2018; Yan et al. 2019; Leike & Enßlin 2019; Chen et al. 2019).

In addition to the main Gaia DR2 data products (parallaxes, proper motions, radial velocities, and photometry), the Gaia DR2 data allowed for the immediate computation of quantities relevant for Galactic stellar population studies. These are the Bayesian geometric distance estimates computed by Bailer-Jones et al. (2018) and the first stellar parameters and extinction estimates from the Gaia Apsis pipeline (Andrae et al. 2018). The latter authors deliberately used only Gaia DR2 data products to infer line-of-sight extinctions as well as effective temperatures, radii, and luminosities. This proved to be a difficult exercise since the three broad Gaia passbands contain little information to discriminate between effective temperature and interstellar extinction. As a result, the Apsis T eff estimates were obtained under the assumption of zero extinction (thus suffering from systematics in the Galactic plane) and the uncertainties in individual G-band extinction and E(G BP − G RP ) colour excess estimates are so large that these values should only be used in ensemble studies (Andrae et al. 2018; Gaia Collaboration 2018b).

The lack of more precise extinction estimates prevented the use of Gaia data for stellar population studies in a larger volume outside the low-extinction regime (Gaia Collaboration 2018a; Antoja et al. 2018). Many of the new Galactic archaeology results derived from Gaia DR2 still concentrate on a small portion of the Gaia data. This is partly due to the necessity of full phase-space information (Gaia Collaboration 2018d; Antoja et al. 2018), but also partly due to extinction uncertainties hampering the direct inference of desired quantities (Gaia Collaboration 2018a; Helmi et al. 2018; Romero-Gómez et al. 2019; Mor et al. 2019).

In this spirit, the aim of this paper is to enlarge the volume in which we can make use of the Gaia DR2 data by providing more accurate and precise extinctions and stellar parameters (most importantly T eff , but also estimates of surface gravity, metallicity, and mass), and more accurate distances for distant giant stars. Although the data quality degrades notably around a magnitude of G ∼ 16.5, we provide useful information for a considerable fraction of stars down to G = 18. To this end, we use the python code StarHorse , originally designed to determine stellar parameters and distances for spectroscopic surveys (Santiago et al. 2016; Queiroz et al. 2018)1. Of the 285 million objects with G ≤ 18 contained in Gaia DR2, our code delivered results for ∼266 million stars. Applying a number of conservative quality criteria to the input and output data (the data quality flags described in Sect. 3.4), we obtain a cleaned sample of around 137 million stars with reliable stellar parameters, distances, and extinctions.

The paper is structured as follows: Sect. 2 presents the input data used in the parameter estimation. The following Sect. 3 describes the basics of our code, focussing on updates with respect to its previous applications to spectroscopic stellar surveys. Section 3.4 in particular explains how we flagged the StarHorse results for Gaia DR2. Since we decided to provide results for all objects that our code converged for, any user of our value-added catalogue should pay particular attention to this subsection. We present some first astrophysical results in Sect. 4, mainly focussing on extinction-corrected colour-magnitude diagrams, stellar density maps, extinction maps, and the emergence of the Galactic bar. We discuss the precision and accuracy of the StarHorse parameters in Sect. 5, providing comparisons to open clusters and stellar parameters obtained from high-resolution spectroscopy. We also compare to previous results obtained from Gaia DR2 in Sect. 6. We conclude the paper with a summary and a brief outlook on possible applications of StarHorse or similar codes to future Gaia data releases.

2. Data

The Gaia satellite is measuring positions, parallaxes, proper motions and photometry for well over 10⁹ sources down to G ≃ 20.7, and obtaining physical parameters and radial velocities for millions of brighter stars. Particularly important for our purposes are the parallaxes, whose precision varies from < 0.1 mas for G ≤ 17 to ≃0.7 mas for G = 20 (Lindegren et al. 2018). Initial tests showed that reliable StarHorse results (that represent an improvement with respect to purely photometric distances) can be obtained up to G ∼ 18. We therefore downloaded Gaia DR2 data for all stars with measured parallaxes up to that magnitude.

It is well known that the parallaxes delivered by Gaia DR2 are not entirely free from systematics (Gaia Collaboration 2018b; Lindegren et al. 2018; Stassun & Torres 2018; Zinn et al. 2019; Khan et al. 2019)2. In particular, Arenou et al. (2018) have shown that the parallax zero-point is subject to a sub-100 μas offset depending on position, and possibly magnitude, parallax, and/or colour. Since our distance inference depends critically on the accuracy of the input parallaxes, but the positional dependence is too complex to calibrate out at the moment, we opted for the following first-order calibrations detailed in Table 1: in the bright regime (G < 14), we apply a correction of +0.05 mas similar to the global offset found by Zinn et al. (2019) and Khan et al. (2019) from asteroseismic and spectroscopic observations in the Kepler field. It should be noted, however, that Khan et al. (2019), in agreement with the quasar comparison shown in Arenou et al. (2018), find different offsets for the Kepler-2 fields C3 and C6, indicating that also in the bright regime the parallax zero-point depends on sky position. In the faint regime (G > 16.5), we use the +0.029 mas correction derived by Lindegren et al. (2018) from AllWISE quasars. For intermediate G magnitudes, the parallax correction is linearly interpolated between these two values.
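The first-order zero-point calibration described above can be illustrated with a few lines of Python. This is a minimal sketch based only on the values quoted in the text (+0.05 mas for G < 14, +0.029 mas for G > 16.5, linear interpolation in between); the function names are hypothetical and are not taken from the StarHorse code.

```python
import numpy as np

# Anchor points quoted in the text: (G magnitude, zero-point correction in mas).
G_ANCHORS = [14.0, 16.5]
CORRECTION_ANCHORS_MAS = [0.05, 0.029]


def parallax_zero_point_mas(g_mag):
    """Correction (mas) to add to the Gaia DR2 parallax (hypothetical helper).

    np.interp holds the end values constant outside [14, 16.5], which gives
    the bright-regime and faint-regime corrections described in the text.
    """
    return np.interp(g_mag, G_ANCHORS, CORRECTION_ANCHORS_MAS)


def calibrated_parallax_mas(parallax_mas, g_mag):
    """Return the zero-point-corrected parallax in mas."""
    return parallax_mas + parallax_zero_point_mas(g_mag)
```

Both functions also accept NumPy arrays, so a whole catalogue column can be corrected in one call.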

Lindegren et al. (2018), Arenou et al. (2018), and others have demonstrated that, similar to the Gaia DR2 parallaxes, also the parallax uncertainties are prone to moderate systematics, in the sense that they are typically slightly underestimated. For this work (see Table 1) we follow a modified version of the recalibration advertised by Lindegren (2018): in the faint regime (G > 15), the external-to-internal uncertainty ratio exponentially drops to 1.08, while at the bright end (G < 12) this factor is set to 1.2. In the intermediate regime, we again opt for linear interpolation, a choice that is supported by the data presented by Lindegren (2018, slide 15).

We note that this re-scaling of the parallax errors takes into account the systematic term σ s (which roughly accounts for the variations of the parallax zero-point over the sky, with magnitude, colour etc.; Eq. (2) in Lindegren 2018) only approximately. By choosing the recalibration detailed in Table 1 we have effectively accounted for σ s in the faint regime. In the bright regime, our recalibrated parallax uncertainties are slightly lower than in the Lindegren model (for bright stars our minimum parallax error is 0.018, below the systematic floor of σ s = 0.021 proposed by Lindegren). While this will be corrected in future runs, we have verified that the results change very little when correctly including the σ s term.

Apart from the parallaxes, we also make use of the three-band Gaia DR2 photometry (G, G BP , G RP ). While these are of an unprecedented precision, several recent works (Weiler 2018; Maíz Apellániz & Weiler 2018; Casagrande & VandenBerg 2018) have shown by comparison with absolute spectrophotometry that the G band suffers from a magnitude-dependent offset, and that the nominal passbands need to be slightly corrected. Therefore, in order to compare the Gaia DR2 G magnitudes to the synthetic Gaia DR2 photometry from stellar models, we have applied the G magnitude corrections, as well as the new passband definitions, given by Maíz Apellániz & Weiler (2018).

Furthermore, we supplement the Gaia data with additional Pan-STARRS1 grizy (Scolnic et al. 2015), 2MASS JHK s , and AllWISE W1W2 photometry, using the cross-matches provided by the Gaia team (Gaia Collaboration 2018b; Marrese et al. 2019). After initial tests, we only used Pan-STARRS1 photometry for stars with magnitudes fainter than G = 14 that do not suffer from saturation problems. For all passbands, missing photometric uncertainties were substituted by fiducial maximum uncertainties of 0.3 mag. We also introduced an error floor of 0.04 mag. For Gaia, 2MASS, and AllWISE, we use an uncertainty floor of 0.03 mag, which can be considered a minimum value for the accuracy of the synthetic photometry used by our method. We verified that this choice does not impact our results.
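The treatment of the photometric uncertainties can be sketched as follows. The example assumes that missing uncertainties are stored as NaN and leaves the floor value to the caller (for instance 0.03 mag for Gaia, 2MASS, and AllWISE); the function name and the NaN convention are assumptions made for illustration only.

```python
import numpy as np

# Fiducial maximum uncertainty (mag) used when a catalogue gives none.
FIDUCIAL_MAX_SIGMA_MAG = 0.3


def regularise_uncertainties(sigma_mag, floor_mag):
    """Replace missing uncertainties and apply an uncertainty floor.

    sigma_mag : array-like of magnitude uncertainties, NaN where missing
                (the NaN convention is an assumption of this sketch).
    floor_mag : minimum uncertainty allowed for this survey, in mag.
    """
    sigma = np.asarray(sigma_mag, dtype=float)
    sigma = np.where(np.isnan(sigma), FIDUCIAL_MAX_SIGMA_MAG, sigma)
    return np.maximum(sigma, floor_mag)
```

For example, `regularise_uncertainties([np.nan, 0.001, 0.1], 0.03)` leaves the third value untouched, raises the second to the floor, and fills the first with the fiducial 0.3 mag.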

3. StarHorse runs

3.1. The code

The advent of massive multiplex spectroscopic stellar surveys has led to the development of a growing number of codes that aim to determine precise distances and extinctions to vast numbers of field stars (for example, Breddels et al. 2010; Zwitter et al. 2010; Burnett & Binney 2010; Binney et al. 2014; Santiago et al. 2016; Wang et al. 2016; Mints & Hekker 2018; Das & Sanders 2019; Leung & Bovy 2019).

The StarHorse code (Queiroz et al. 2018) is a Bayesian parameter estimation code that compares a number of observed quantities (be it photometric magnitudes, spectroscopically derived stellar parameters, or parallaxes) to stellar evolutionary models. In a nutshell, it finds the posterior probability over a grid of stellar models, distances, and extinctions, given the set of observations plus a number of priors. The priors include the stellar initial mass function (in our case Chabrier 2003), density laws for the main components of the Milky Way (thin disc, thick disc, bulge, and halo), as well as broad metallicity and age priors for those components. We refer to Queiroz et al. (2018) for more details. In this work we also used a broad top-hat prior on extinction (−0.3 ≤ A V ≤ 4.0) for stars with low parallax signal-to-noise ratios, ensuring the convergence of the code. This should be kept in mind when interpreting our results for highly extincted stars in the inner Galaxy. The impact of our choice of the priors on the results for the inner regions of the Galaxy is studied in more detail in Queiroz et al. (in prep.).
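To make the idea of a grid-based posterior concrete, the following toy example evaluates a one-dimensional posterior over a distance grid and summarises it by its 16th, 50th, and 84th percentiles. It is only a sketch under strong simplifications: a Gaussian parallax likelihood, a user-supplied log-prior, and no stellar models or photometry. None of the function names below come from StarHorse.

```python
import numpy as np


def log_posterior_distance(dist_kpc, parallax_mas, sigma_mas, log_prior=0.0):
    """Toy log-posterior: Gaussian parallax likelihood plus a log-prior."""
    dist_kpc = np.asarray(dist_kpc, dtype=float)
    residual = (parallax_mas - 1.0 / dist_kpc) / sigma_mas
    return -0.5 * residual**2 + log_prior


def weighted_percentiles(values, log_posterior, percentiles=(16, 50, 84)):
    """Percentiles of `values` weighted by exp(log_posterior) on a grid."""
    values = np.asarray(values, dtype=float)
    logp = np.asarray(log_posterior, dtype=float)
    weights = np.exp(logp - logp.max())  # subtract the maximum for stability
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cdf = np.cumsum(weights) / weights.sum()
    return np.interp(np.asarray(percentiles) / 100.0, cdf, values)


# Example: a star with a parallax of 1.0 +/- 0.05 mas and a flat prior.
grid_kpc = np.linspace(0.5, 2.0, 3001)
d16, d50, d84 = weighted_percentiles(
    grid_kpc, log_posterior_distance(grid_kpc, 1.0, 0.05)
)
```

In the real code the grid additionally spans stellar models and extinctions, and the priors described above replace the flat prior used here.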

The first version of the code was developed by Santiago et al. (2016) in the context of the RAVE survey (Steinmetz et al. 2006) and the SDSS-III (Eisenstein et al. 2011) spectroscopic surveys SEGUE (Yanny et al. 2009) and APOGEE (Majewski et al. 2017). In Queiroz et al. (2018) the code was ported to python 2.7 and made more flexible in the choice of input, priors, etc. With respect to that publication, we have implemented some important changes that were necessary to apply StarHorse to the huge Gaia DR2 dataset.

3.2. Code updates and improvements

With respect to Queiroz et al. (2018), a few updates to the StarHorse code have been carried out. Most importantly, we now take better account of dust extinction when comparing synthetic and observed photometry, an update that was necessary due to the use of the broad-band optical Gaia passbands.

Dust-attenuated synthetic photometry. As explained in Holtzman et al. (1995), Sirianni et al. (2005), or Girardi et al. (2008), dust-attenuated photometry of very broad photometric passbands (such as the Gaia DR2 ones) should take into account that the passband extinction coefficient A i /A V for a star varies as a function of its source spectrum F λ (most importantly its T eff ) as well as extinction A V 3 itself:

$$\frac{A_i}{A_V} = -\frac{2.5}{A_V}\,\log_{10}\left[\frac{\int F_\lambda\, T^i_\lambda\, 10^{-0.4\, a_\lambda A_V}\,\mathrm{d}\lambda}{\int F_\lambda\, T^i_\lambda\,\mathrm{d}\lambda}\right] \tag{1}$$

Here, $T^i_\lambda$ is the transmission curve of passband i, and a λ is the extinction law. Therefore, one has to compute the coefficients A i /A V for each stellar model and each extinction value considered. In most of the recent literature concerning stellar distances, this effect is not taken into account, because for narrow-band and infra-red passbands, the extinction coefficient is roughly constant. For the Gaia passbands, however, this is no longer the case (Jordi et al. 2010). In the new version of StarHorse we therefore use the Kurucz grid of synthetic stellar spectra (Kurucz 1993)4 to compute a grid of bolometric corrections as a function of T eff and A V for each passband, and for our default extinction law (Schlafly et al. 2016).
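The dependence of the passband extinction coefficient on T eff and A V can be illustrated numerically. The sketch below assumes the usual passband-integrated definition, A_i = −2.5 log10 of the ratio of attenuated to unattenuated in-band flux. It uses deliberately crude stand-ins: a blackbody instead of Kurucz spectra, a top-hat transmission curve, and a λ⁻¹ extinction law normalised at 550 nm instead of the Schlafly et al. (2016) law. All names and ingredients are assumptions of this example, not of StarHorse.

```python
import numpy as np

HC_OVER_K_NM_K = 1.4388e7  # h*c/k_B expressed in nm K


def blackbody_flux(wave_nm, teff):
    """Blackbody F_lambda up to a constant factor (stand-in spectrum)."""
    return wave_nm**-5.0 / np.expm1(HC_OVER_K_NM_K / (wave_nm * teff))


def toy_extinction_law(wave_nm):
    """a_lambda = A_lambda / A_V for a toy 1/lambda law (assumption)."""
    return 550.0 / wave_nm


def extinction_coefficient(teff, av, band_nm=(330.0, 1050.0), n=2000):
    """A_i / A_V for a top-hat passband covering band_nm.

    The wavelength grid is uniform, so the step size cancels in the ratio
    of the two integrals and plain sums are sufficient.
    """
    wave = np.linspace(band_nm[0], band_nm[1], n)
    flux = blackbody_flux(wave, teff)
    attenuated = flux * 10.0 ** (-0.4 * toy_extinction_law(wave) * av)
    return -2.5 / av * np.log10(attenuated.sum() / flux.sum())


# Broad optical band: the coefficient depends on both T_eff and A_V.
broad_cool = extinction_coefficient(4000.0, 1.0)
broad_hot = extinction_coefficient(10000.0, 1.0)
# Narrow near-infrared band: the coefficient is nearly constant.
narrow_cool = extinction_coefficient(4000.0, 1.0, band_nm=(2100.0, 2200.0))
narrow_hot = extinction_coefficient(10000.0, 1.0, band_nm=(2100.0, 2200.0))
```

With these toy ingredients the broad-band coefficient is clearly larger for the hot star and decreases with increasing A V, while the narrow infra-red band shows almost no variation, which is the qualitative behaviour described in the text.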

Additional output. While Queiroz et al. (2018) used spectroscopically determined stellar parameters as input and therefore only reported distances and extinctions (and in the case of high-resolution spectroscopy also masses and ages; e.g. Anders et al. 2018), the absence of spectroscopically determined effective temperatures, gravities, and metallicities in the case of Gaia+photometry data led to the decision to also report the posterior values of T eff , log g, and [M/H]. Since the photometric estimates for log g, [M/H], and stellar mass are of significantly lower precision, we regard these as secondary output parameters, in contrast to the primary output parameters d, A V , and T eff . The secondary parameters were mainly obtained to test the targeting strategy of the 4MOST low-resolution disc and bulge survey (4MIDABLE-LR; Chiappini et al. 2019), and the functionality of the 4MOST simulator (4FS; see de Jong et al. 2019). Furthermore, in addition to the V-band extinction values A V , we also provide median extinction values in the Gaia DR2 passbands G, G BP , and G RP , as well as the extinction-corrected absolute magnitude M G 0 and the dereddened colour (G BP − G RP ) 0 .

Computational updates. Since Queiroz et al. (2018), the StarHorse code has been migrated from python 2.7 to python 3.6 and now runs on the newton cluster at the Leibniz-Institut für Astrophysik Potsdam (AIP). Due to several improvements in the data handling, the runtime was reduced by a factor of 6 compared to the version used in Queiroz et al. (2018).

3.3. StarHorse setup

We then ran the StarHorse code (Santiago et al. 2016; Queiroz et al. 2018). In this work we used a grid of PARSEC 1.2S stellar models (Bressan et al. 2012; Chen et al. 2014; Tang et al. 2014) in the 2MASS, Pan-STARRS1, Gaia DR2 rederived (Maíz Apellániz & Weiler 2018), and WISE photometric systems available on the CMD webpage maintained by L. Girardi5. For G ≥ 14, we use a model grid equally spaced by 0.1 dex in log age as well as in metallicity [M/H]. Due to the higher precision of the Gaia DR2 parallaxes for G < 14, we used a finer grid with 0.05 dex spacing in the bright regime.

For computational reasons, we constructed the range of possible distance values in different ways depending on the parallax quality: for stars with well-determined parallaxes (parallax signal-to-noise ratio greater than 5; see Sect. 3.4), we required the distances to lie within the range allowed by 4σ deviations of the parallax. For stars with less precisely measured parallaxes, we used their G magnitudes to constrain the distance range for each possible stellar model (for details, see Queiroz et al. 2018).

For the Gaia DR2 run (i.e. in the absence of spectroscopic data), the code took 1 s per star on the coarse grid (G > 14, 270M stars), and 20 s per star on the fine grid (G ≤ 14, 16M stars). In total, the computational cost of this StarHorse run was thus ∼164 000 CPU hours (19 years on a single CPU). The global statistics for our output results are summarised in Table 2 and discussed in detail in Sect. 4.
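The quoted total follows directly from the per-star runtimes and the approximate sample sizes given above, as the short calculation below shows (the variable names are chosen for this example only).

```python
SECONDS_PER_HOUR = 3600.0
HOURS_PER_YEAR = 24.0 * 365.25

# Approximate sample sizes and per-star runtimes quoted in the text.
coarse_seconds = 270e6 * 1.0   # coarse grid: 270M stars at 1 s each
fine_seconds = 16e6 * 20.0     # fine grid: 16M stars at 20 s each

total_cpu_hours = (coarse_seconds + fine_seconds) / SECONDS_PER_HOUR
total_cpu_years = total_cpu_hours / HOURS_PER_YEAR

print(f"{total_cpu_hours:.0f} CPU hours, {total_cpu_years:.1f} CPU years")
```

Although the fine grid covers only about 6% of the stars, it accounts for slightly more than half of the total CPU time.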

3.4. Input and output flags

Along with the output of our code (median statistics of the marginal posterior in distance, extinction, and stellar parameters), we provide a set of flags to help the user decide which subset of the data to use for their particular science case. These flags correspond to the following columns.

SH_GAIAFLAG: This flag describes the overall astrometric and photometric quality of the Gaia DR2 data for each star in a three-digit flag (similar to the Gaia DR2-native priam_flag 6). Balancing simplicity and the recommendations of Lindegren et al. (2018) and Lindegren (2018), we limit this flag to the following three digits:

Renormalised unit weight error flag: Lindegren (2018) recently showed that instead of following the astrometric quality requirements used by Gaia Collaboration (2018b), Lindegren et al. (2018), and Arenou et al. (2018), similar or better cleaning of spurious Gaia DR2 astrometry can be obtained by requiring a maximum value for the so-called renormalised unit weight error ( ruwe ). We therefore defined the first digit as follows: IF ruwe < 1.4 THEN 0 ELSE 1
Colour excess factor flag: Evans et al. (2018) and Arenou et al. (2018) recommend the use of the phot_bp_rp_excess_factor to flag spurious Gaia DR2 photometry. We follow their recommendation and define the second digit as: IF G BP − G RP IS NULL THEN 2 ELIF 1.0 + 0.015 · (G BP − G RP )² < phot_bp_rp_excess_factor < 1.3 + 0.060 · (G BP − G RP )² THEN 0 ELSE 1
Variability flag: The third digit equals the Gaia DR2-native phot_variable_flag .
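The three digits defined above can be assembled as in the following sketch. The function names are hypothetical. Since the text does not specify how the Gaia DR2 phot_variable_flag is encoded as a digit, the third digit is simply passed in by the caller.

```python
import math


def ruwe_digit(ruwe):
    """First digit: 0 if ruwe < 1.4, else 1."""
    return "0" if ruwe < 1.4 else "1"


def colour_excess_digit(bp_rp, excess_factor):
    """Second digit, from the colour-dependent excess-factor window."""
    if bp_rp is None or math.isnan(bp_rp):
        return "2"  # G_BP - G_RP colour not available
    lower = 1.0 + 0.015 * bp_rp**2
    upper = 1.3 + 0.060 * bp_rp**2
    return "0" if lower < excess_factor < upper else "1"


def sh_gaiaflag(ruwe, bp_rp, excess_factor, variability_digit):
    """Concatenate the three digits into the flag string.

    variability_digit is supplied by the caller because its encoding is
    not given in the text (assumption of this sketch).
    """
    return (
        ruwe_digit(ruwe)
        + colour_excess_digit(bp_rp, excess_factor)
        + str(variability_digit)
    )
```

A star with ruwe = 1.0, G BP − G RP = 1.0, and an excess factor of 1.2 lies inside the window 1.015 < 1.2 < 1.36 and therefore receives "00" in the first two digits.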

The human-readable SH_PHOTOFLAG input flag details which combination of photometric data (Gaia, Pan-STARRS1, 2MASS, WISE) was used as input for StarHorse . For example, if photometry in all passbands was available for a star, the SH_PHOTOFLAG entry reads GBPRPgrizyJHKsW1W2 . If only Gaia DR2 G and Pan-STARRS1 izy magnitudes were available, the flag reads Gizy . In addition, in the rare case that no uncertainty for a particular photometric band was available from the original catalogue and the fiducial uncertainty of 0.3 mag was used instead (see Sect. 2), we added a “ # ” to the corresponding passband. For example, if a star has complete Gaia DR2, Pan-STARRS1, and WISE photometry, but the r and W2 magnitudes come without uncertainties, then the SH_PHOTOFLAG entry would be GBPRPgr#izyW1W2# .
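The construction of this string can be reproduced with a short function. The sketch below assumes that magnitudes and uncertainties are held in dictionaries keyed by band name, with None (or a missing key) marking unavailable values. This data layout and the function name are assumptions made for illustration.

```python
# Band order as it appears in the complete flag GBPRPgrizyJHKsW1W2.
BAND_ORDER = ["G", "BP", "RP", "g", "r", "i", "z", "y", "J", "H", "Ks", "W1", "W2"]


def sh_photoflag(magnitudes, uncertainties):
    """Build the photometry flag from the available bands.

    A '#' is appended to every band whose magnitude is present but whose
    uncertainty is missing (the fiducial 0.3 mag is used in that case).
    """
    parts = []
    for band in BAND_ORDER:
        if magnitudes.get(band) is None:
            continue  # band not used as input
        parts.append(band)
        if uncertainties.get(band) is None:
            parts.append("#")
    return "".join(parts)
```

Applied to the three cases described in the text, this returns GBPRPgrizyJHKsW1W2 , Gizy , and GBPRPgr#izyW1W2# , respectively.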

The SH_PARALLAXFLAG input flag informs about the precision of the Gaia DR2 parallaxes (accounting for zero-point shift and uncertainty corrections; see Table 1). For parallax signal-to-noise ratios greater than 5 (parallaxes better than 20%), the flag reads gtr5 , else leq5 . As explained in Sect. 3.3, this has consequences for the construction of the posterior PDF in the StarHorse code: if the parallax is precise, then the range of possible distances is computed directly from the parallax itself (allowing for 4σ deviations). On the other hand, if only uncertain parallaxes are available, the range of possible distance moduli is constructed based on the measured G magnitude. We verified that this choice does not produce different results for stars near the decision boundary (the rupture in Fig. 1 does not occur at the decision boundary of 5, but at ≃1./0.22 ≃ 4.55, which is where the standard deviation of the inverse parallax PDF increases sharply; see Bailer-Jones 2015, Sect. 4.1).
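A minimal sketch of this decision is given below. It takes the recalibrated parallax and uncertainty (in mas) as input. The explicit form of the 4σ distance interval, 1/(ϖ + 4σ) to 1/(ϖ − 4σ), is one natural reading of the text and should be regarded as an assumption, as should the function names.

```python
def sh_parallaxflag(parallax_mas, sigma_mas):
    """Return 'gtr5' for a parallax signal-to-noise ratio > 5, else 'leq5'."""
    return "gtr5" if parallax_mas / sigma_mas > 5 else "leq5"


def distance_range_kpc(parallax_mas, sigma_mas, n_sigma=4.0):
    """Distance interval (kpc) allowed by n_sigma parallax deviations.

    Only defined for precise parallaxes: for a signal-to-noise ratio above 5
    the lower parallax bound (parallax - 4 sigma) is guaranteed to be positive.
    """
    if sh_parallaxflag(parallax_mas, sigma_mas) != "gtr5":
        raise ValueError("use the G-magnitude-based range for uncertain parallaxes")
    nearest = 1.0 / (parallax_mas + n_sigma * sigma_mas)
    farthest = 1.0 / (parallax_mas - n_sigma * sigma_mas)
    return nearest, farthest
```

For a parallax of 1.0 ± 0.1 mas, the interval extends from about 0.71 to 1.67 kpc, illustrating how asymmetric the distance range becomes even for a 10% parallax.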

The StarHorse output flag SH_OUTFLAG , similar to SH_GAIAFLAG , consists of several digits that inform about the fidelity of the StarHorse output parameters.

Main StarHorse reliability flag: If this digit equals 1, then the star has a very broad distance PDF: IF 0.5 · ( dist84 − dist16 )/ dist50 < We justify this definition, a cut in the posterior log g vs. distance plane, in Appendix A. The essence of this definition is that median statistics of the posterior parameters for stars where this digit equals 1 should be treated with utmost care, as their combination often yields unphysical results. For instance, some stars fall in regions of the extinction-corrected CMD that are inconsistent with any stellar model (due to complex multi-modal PDFs; see Appendix B), meaning that their median posterior absolute magnitude, distance, and extinction should not be used together (see for instance the unphysical “nose” feature between the main sequence and the red-giant branch in Fig. 4, bottom right panel). We verified that this effect only occurs for faint stars with very uncertain parallaxes, which is when the PDF of inverse parallax becomes very noisy and biased (see Fig. 1 and Bailer-Jones 2015; Astraatmadja & Bailer-Jones 2016a; Luri et al. 2018). This results in a poor discrimination between dwarfs and giants for these typically faint (G ≳ 16.5; see Fig. 2) stars. Although their median effective temperatures and extinctions may still be useful, their median 1D distances and other parameters should not be used. We discuss the issue in more detail in Appendix A. In future StarHorse runs we will resort to a more sophisticated treatment of multimodal posterior PDFs.
Large distance flag: For some stars (especially extragalactic objects that are still bright enough to be in Gaia DR2, such as stars in the Magellanic Clouds or the Sagittarius dSph), StarHorse delivers very large posterior distances, many of which are likely affected by significant biases due to the dominance of the Galactic prior used to infer them: IF dist50 < 20 THEN 0 ELIF dist50 < 30 THEN 1 ELSE 2
Unreliable extinction flag: Significantly negative extinctions, or A V values close to the prior boundary at A V = 4, should be treated with care: IF ( AV95 > 0 AND AV95 < 3.9 ) THEN 0 ELIF AV95 < 0 THEN 1 ELIF AV84 < 3.9 THEN 2 ELSE 3
Large A V uncertainty flag: Very large extinction uncertainties point to either incomplete or very uncertain input data: IF 0.5 · ( AV84 − AV16 ) < 1 THEN 0 ELSE 1
Very small uncertainty flag: Very small posterior uncertainties are most likely underestimated and indicate poor StarHorse convergence (either due to inconsistent input data or too coarse model grid size). These results should therefore also be used with care. The definition is as follows: IF 0.5 · ( dist84 − dist16 )/ dist50 < 0.001 OR 0.5 · ( av84 − av16 ) < 0.01 OR 0.5 · ( teff84 − teff16 ) < 20. OR 0.5 · ( logg84 − logg16 ) < 0.01 OR 0.5 · ( met84 − met16 ) < 0.01 OR 0.5 · ( mass84 − mass16 )/ mass50 < 0.01 THEN 1 ELSE 0.
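The last four digits of the output flag can be evaluated directly from the posterior percentiles, as in the sketch below. The first (main reliability) digit is left out. The percentiles are assumed to be available in a dictionary with lower-case keys such as dist50 or av84 . This layout and all function names are hypothetical, and the conditions are transcribed literally from the definitions above.

```python
def large_distance_digit(dist50):
    """0 below 20, 1 between 20 and 30, 2 above (catalogue distance units)."""
    if dist50 < 20:
        return "0"
    if dist50 < 30:
        return "1"
    return "2"


def unreliable_extinction_digit(av84, av95):
    """Flags negative extinctions and values near the prior boundary."""
    if 0 < av95 < 3.9:
        return "0"
    if av95 < 0:
        return "1"
    if av84 < 3.9:
        return "2"
    return "3"


def large_av_uncertainty_digit(av16, av84):
    """1 if the half-width of the A_V posterior is 1 mag or more."""
    return "0" if 0.5 * (av84 - av16) < 1 else "1"


def small_uncertainty_digit(p):
    """1 if any posterior half-width is suspiciously small."""
    def half_width(name):
        return 0.5 * (p[name + "84"] - p[name + "16"])

    too_small = (
        half_width("dist") / p["dist50"] < 0.001
        or half_width("av") < 0.01
        or half_width("teff") < 20.0
        or half_width("logg") < 0.01
        or half_width("met") < 0.01
        or half_width("mass") / p["mass50"] < 0.01
    )
    return "1" if too_small else "0"


def outflag_digits_2_to_5(p):
    """Concatenate digits 2 to 5 of the output flag for one star."""
    return (
        large_distance_digit(p["dist50"])
        + unreliable_extinction_digit(p["av84"], p["av95"])
        + large_av_uncertainty_digit(p["av16"], p["av84"])
        + small_uncertainty_digit(p)
    )
```

A star passes these four criteria only if the function returns "0000".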

3.5. Data access

StarHorse delivered distances and extinctions for 265 637 087 objects, of which 151 506 183 pass the post-calculation quality flags included in SH_OUTFLAG , and 136 606 128 stars pass both the SH_OUTFLAG and the SH_GAIAFLAG , which includes the recent recommendations of Lindegren et al. (2018). For clarity, all our calibration choices are listed in Table 1. The main statistics are summarised in Table 2. Our results, together with documentation, can be queried via the AIP Gaia archive at gaia.aip.de . Example queries can be found in Appendix D. In addition, the output files are available for download in HDF5 format at data.aip.de . The digital object identifier for this dataset is doi:10.17876/gaia/dr.2/51

4. StarHorse Gaia DR2 results

4.1. Summary

Table 2 summarises the results of the present StarHorse run for Gaia DR2 and puts them in context with previous results available from the literature (three references for distances and extinctions for Gaia stars observed by spectroscopic surveys, as well as the only two studies that attempted to determine distances and extinctions, respectively, for the whole Gaia DR2 dataset). In particular, the table informs about sample sizes, magnitude ranges, and the typical precision in the primary output parameters d, A V , and T eff . In this table, we also define some useful sub-samples of the Gaia DR2 StarHorse data (identified by colour in some of the subsequent plots) that are used throughout this paper. These are:

stars with (recalibrated) parallaxes more precise than 20% (blue colour; 39% of the converged stars),
stars with SH_GAIAFLAG equal to “000” (cyan colour; 88% of the converged stars),
stars with SH_OUTFLAG equal to “00000” (orange colour; 57% of the converged stars),
stars with SH_OUTFLAG equal to “00000” and SH_GAIAFLAG equal to “000” (red colour; 52% of the converged stars).

The G magnitude distribution for each of these sub-samples is shown in Fig. 2. In this paper, we will mainly concentrate on the “both-flags”-cleaned sample.

Figure 3 presents the output of the StarHorse code for the Gaia DR2 sample in one plot. The figure displays the distributions and correlations of the median StarHorse primary output parameters T eff , d, and A V , and their respective uncertainties, as well as G magnitude and parallax signal-to-noise ratio. The grey contours in this plot refer to all converged stars, whereas the red contours emphasise the results for the stars with SH_GAIAFLAG equal to “000” and SH_OUTFLAG equal to “00000”. For a plot including also the secondary output parameters log g, [M/H], and M * , we refer to Fig. B.1.

The panels in the diagonal row of Fig. 3 provide the one-dimensional distributions in G magnitude, parallax signal-to-noise, the median output parameters, and the distributions of the corresponding uncertainties (in logarithmic scaling) as area-normalised histograms. Each of the panels also illustrates the effect of applying the recommended flags: the red uncertainty distributions are typically confined to smaller values than the faint grey ones.

The off-diagonal plots of Fig. 3 show the correlations between the output parameters. We observe complex structures in many of these panels, most of which are due to physical correlations stemming from stellar evolution or selection effects. For example, the strong bimodality between giants and dwarfs in the log g vs. T eff diagram (see third column, second panel from top in Fig. B.1) is reflected in many of the panels, most notably the distance distribution (fourth column). In addition, we note that some of the complex structure disappears when the flag cleaning is applied to the data. Different behaviour of the red and grey distributions in some panels should warn the user about potentially spurious correlations that may appear when using the full StarHorse sample.

4.2. Extinction-cleaned CMDs

As a first sanity check, in Fig. 4 we present StarHorse -derived Gaia DR2 colour-magnitude diagrams (CMDs) for the full converged sample (i.e. excluding mostly white dwarfs and galaxies) in four magnitude bins. Focussing first on the top left panel (G < 14), we note very well-defined features of stellar evolution in the CMD: a thin main sequence (broadening in the very blue and very red regimes), a well-populated sub-giant branch, as well as a very thin red clump, the red giant branch, and the asymptotic giant branch. We also notice more subtle features such as the red-giant bump or the secondary red clump.

As we move to fainter magnitude bins, the number of objects grows, but so does the typical uncertainty in the main input parameter, the parallax, resulting in a gradual broadening of the sharp stellar-evolution features observed in the top left panel of Fig. 4. In the lower left panel, we begin to note some additional features that are not directly related to stellar evolution. For example, the almost vertical arm at (BP − RP) 0 below the main sequence is related to problematic astrometry (large ruwe values). Furthermore, the discrete stripes in the red main sequence are related to the finite mass, age, and metallicity resolution of the stellar model grid used. Some other unphysical structures, such as the nose between the main sequence and the giant branch, are induced by poor convergence of StarHorse (see Sect. 3.4.4). The higher relative number of giant stars in the fainter magnitude bins with respect to the G < 14 sample is an effect of stellar population sampling.

Figure 5 shows another collection of StarHorse CMDs, now highlighting the sub-samples defined in Table 2. As discussed in Sect. 3.4.4, the full StarHorse sample occupies a larger volume in the CMD, including an unphysical region between the main sequence and the red-giant branch that is due to stars with poorly determined parallaxes. These stars disappear when applying the SH_OUTFLAG (orange dots in lower middle panel), and a further cleaning using the SH_GAIAFLAG results in a nice physical CMD for 137 million stars (lower right panel of Fig. 5).

Comparing the upper right and lower right panel of Fig. 5, we see that the number of red giants in the latter is much higher, leading to a slight broadening of the RC locus and a substantially higher number of AGB stars. This is due to the fact that StarHorse is able to determine still surprisingly precise (∼30%) photo-astrometric distances for giants with poor parallax measurements.

4.3. Kiel diagrams

Figure 6 shows Kiel diagrams (log g vs. T eff ) using the median posterior StarHorse results, for the full sample of converged stars and for the flag-cleaned sample defined in Table 2. The middle column of that figure shows the median distance and median A V extinction in each pixel of the Kiel diagram, respectively. The right column shows their respective uncertainties in each pixel. The complex dependence of the uncertainties on the stellar parameters reflects the abrupt decrease in precision below seen in Fig. 1.

4.4. Stellar density maps and the emergence of the Galactic bar

Figure 7 presents four projections of the stellar density distribution in Galactocentric co-ordinates for the flag-cleaned sample. The solar position (in kpc) is at (X Gal , Y Gal , Z Gal )=(8.2, 0, 0.025). The figure emphasises the loss of stars near the Galactic midplane towards the inner Galaxy, which is due to both the high dust extinction affecting the Gaia selection function, and the low number of stars that pass the flag quality criteria in these regions. Several conclusions can be drawn from Fig. 7, as we describe next.

As pointed out by Bailer-Jones (2015) and Luri et al. (2018), for example, the naive 1/ϖ estimator provides biased distances, especially in the case of low parallax precision, extending the observed volume to implausibly large distances. On the other hand, the exponentially decreasing density prior recently used by Bailer-Jones et al. (2018) is more apt for main-sequence stars and tends to underestimate the distances to distant luminous giant stars. The StarHorse results for those stars, taking into account photometric information as well as more complex priors, show for the first time that Gaia DR2 already allows us to probe stellar populations in the bulge and beyond. A detailed comparison with Bailer-Jones et al. (2018) is presented in Sect. 6.1.
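The pull of an exponentially decreasing density prior on low-precision parallaxes can be illustrated with a minimal sketch. It assumes a posterior of the form r² exp(−r/L) times a Gaussian parallax likelihood and a single length scale L = 1.35 kpc; both the fixed length scale and the function name are illustrative assumptions, not the setup of Bailer-Jones et al. (2018) or of StarHorse.

```python
import numpy as np

def edsd_posterior_mode(parallax_mas, sigma_mas, length_kpc=1.35):
    """Mode of the distance posterior for an exponentially decreasing
    space density prior, found by brute force on a distance grid.
    length_kpc is an assumed, fixed prior length scale."""
    r = np.linspace(0.01, 30.0, 30000)              # distance grid in kpc
    log_prior = 2.0 * np.log(r) - r / length_kpc     # r^2 * exp(-r / L)
    log_like = -0.5 * ((parallax_mas - 1.0 / r) / sigma_mas) ** 2
    return r[np.argmax(log_prior + log_like)]

# Precise parallax (1% error): the mode stays close to the naive 1/parallax.
d_precise = edsd_posterior_mode(1.0, 0.01)

# Giant at 8 kpc (parallax 0.125 mas) measured with a 100% relative error:
# the posterior mode is pulled well inside the true distance.
d_noisy = edsd_posterior_mode(0.125, 0.125)

# Uninformative parallax: the mode falls back to the prior mode, 2 * L.
d_prior = edsd_posterior_mode(0.125, 5.0)

print(d_precise, d_noisy, d_prior)
```

In this toy setup the giant placed at 8 kpc is assigned a most probable distance of roughly half that value, which is the kind of underestimate that additional photometric information can mitigate.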

The clearest novel feature of the StarHorse density map shown in Fig. 7 is the presence of a stellar overdensity coinciding with the expected position of the Galactic bar, inclined by about 40° with respect to the solar azimuth, and with a semi-major axis of about 4 kpc. This almost direct detection of the Galactic bar is confirmed with StarHorse distances for APOGEE stars and discussed in detail in a separate paper (Queiroz et al., in prep.). The significance of the result lies in the fact that, although we are using a prior for the Galactic bulge-bar (Robin et al. 2012), the shape and inclination angle of the observed overdensity are quite different from those of our bar prior (see Fig. 8), even when invoking an interplay with possible observational biases.

The presence of the Galactic bar in the Gaia DR2 data is even more prominent when we focus only on the red-clump stars. Figure 8 shows the resulting density map when selecting flag-cleaned RC stars close to the Galactic plane (|Z Gal |< 3 kpc) from the StarHorse Kiel diagram (4500 K < teff50 < 5000 K, 2.35 < logg50 < 2.55, − 0.6 < met50 < 0.4). The density contrast of the RC bar with respect to the RC population in front of the bar amounts to almost 50. This could in fact be a physical feature of the Galactic disc: the RC is a tracer of the young-to-intermediate age population (∼1 − 4 Gyr; e.g. Girardi 2016), and the star-formation history in the inner disc outside the bar region is still poorly constrained. It is more likely, however, that the observed shape of the bar (and especially the density drop in front of it) in Fig. 8 is a combined effect of the Gaia DR2 selection function, the stellar density profile of the inner disc, our adopted bulge prior, and the quality flag cuts used to produce Fig. 8. At the present stage, we therefore caution the reader not to take the star count numbers in this map at face value, and refer to Queiroz et al. (in prep.) for a more in-depth discussion.
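The red-clump selection box quoted above can be written as a simple boolean mask. The sketch below is illustrative only: the column names teff50, logg50, and met50 follow the text, whereas zgal (height above the plane in kpc) and the toy catalogue are hypothetical, and the quality-flag cleaning is assumed to have been applied beforehand.

```python
import numpy as np

def red_clump_mask(cat, zmax_kpc=3.0):
    """Boolean mask for the red-clump box in the Kiel diagram.
    `cat` maps column names to NumPy arrays; `zgal` is a hypothetical name."""
    teff, logg, met = cat["teff50"], cat["logg50"], cat["met50"]
    return (
        (teff > 4500.0) & (teff < 5000.0)
        & (logg > 2.35) & (logg < 2.55)
        & (met > -0.6) & (met < 0.4)
        & (np.abs(cat["zgal"]) < zmax_kpc)
    )

# Toy catalogue: a red-clump star, a sub-giant, a dwarf, and a halo giant.
toy = {
    "teff50": np.array([4750.0, 4750.0, 5600.0, 4800.0]),
    "logg50": np.array([2.45, 3.10, 4.40, 2.40]),
    "met50": np.array([0.0, 0.0, -0.1, -0.2]),
    "zgal": np.array([0.1, 0.2, 0.0, 4.0]),
}
mask = red_clump_mask(toy)
print(mask)
```

Only the first toy star passes all cuts; the last one fails solely because of its height above the plane.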

The lower right panel of Fig. 7 shows the density map in Galactocentric cylindrical co-ordinates R Gal vs. Z Gal . Especially in this panel we note two overdensities in the direction of the Magellanic Clouds. These are mostly composed of stars belonging to the Clouds that have been forced to smaller distances by our Milky Way prior (which does not contain any extragalactic stellar population, only a smooth halo with a power-law density). The results for these stars have not been excluded from our analysis, but should be used with caution. The same is true for other nearby galaxies with resolved stellar populations, such as the Sagittarius dSph, Fornax, etc.

4.5. Kinematic maps

Several studies have already used our distances for the Gaia DR2 sub-sample of stars with radial velocity measurements (Katz et al. 2019) in kinematic analyses of the Galactic disc. Quillen et al. (2018) used our results to study the arches and ridges in velocity space found by Gaia Collaboration (2018d), Antoja et al. (2018) and Kawata et al. (2018), attributing some of them to stellar orbit crossings with spiral arms. Monari et al. (2018) used our distances to counter-rotating stars in the Galactic halo to measure the escape speed curve and the mass of the Milky Way. Recently, Carrillo et al. (2019) used StarHorse distances together with Gaia DR2 positions, proper motions, and line-of-sight velocities, to study the 3D velocity distribution in the Milky Way disc. They confirmed the bulk vertical motions seen in earlier data, consistent with a combination of breathing and bending modes, and identified a strong radial V R gradient in the Galactic inner disc, transitioning smoothly from 15 km s−1 kpc−1 at Galactic azimuth Φ Gal ∼ 50 deg to −15 km s−1 kpc−1 at Galactic azimuth Φ Gal ∼ − 50 deg. Our StarHorse results were essential for this type of work, since they enabled the authors to probe much larger heliocentric distances.

To further illustrate the accuracy of our distances for distant red-giant stars, in Fig. 9 we show a proper-motion map of the disc red-clump sample used in Fig. 8 and Sect. 4.4. Both panels of Fig. 9 show proper motion in Galactic longitude corrected for the solar motion, μ l, LSR , as a function of Galactic position. The top panel shows the StarHorse red-clump stars, while the bottom panel shows the red-giant sample studied by Romero-Gómez et al. (2019). For comparability, we assume the same values for the solar motion (U ⊙ = 11.1 km s−1, V ⊙ = 12.24 km s−1; Schönrich et al. 2010) and the distance to the Galactic Centre (8.34 kpc; Reid et al. 2014) as in Romero-Gómez et al. (2019), although the residual small-scale dipole variations close to the solar position suggest that the solar motion correction may have to be slightly revised.
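A solar-motion correction of this kind can be sketched as follows. The conventions are assumptions made for illustration: U points towards the Galactic Centre, V in the direction of Galactic rotation, the input proper motion is μ l cos b in mas yr−1, and only the peculiar solar motion (not the rotation of the LSR) is removed. The function name is hypothetical.

```python
import numpy as np

K = 4.74047  # km/s corresponding to 1 mas/yr at 1 kpc

def mu_l_lsr(mu_l_star, l_deg, dist_kpc, u_sun=11.1, v_sun=12.24):
    """Proper motion in Galactic longitude (times cos b, mas/yr) after
    adding back the reflex of the peculiar solar motion (km/s)."""
    l = np.radians(l_deg)
    # Component of the solar velocity along the direction of increasing l.
    v_l_sun = -u_sun * np.sin(l) + v_sun * np.cos(l)
    return mu_l_star + v_l_sun / (K * dist_kpc)

# A star at rest in the LSR at l = 0 and d = 2 kpc is observed with the
# reflex motion -V_sun / (K d); the correction brings it back to zero.
mu_obs = -12.24 / (K * 2.0)
mu_corr = mu_l_lsr(mu_obs, 0.0, 2.0)

# Size of the correction at l = 90 deg and d = 1 kpc (driven by U_sun).
corr_l90 = mu_l_lsr(0.0, 90.0, 1.0)
print(mu_corr, corr_l90)
```

Because the correction scales as 1/d, distance errors propagate directly into the corrected proper motions of nearby stars, while for distant red-clump stars the correction itself becomes small.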

The study of Romero-Gómez et al. (2019) concerned the morphology and kinematics of the Galactic warp, so it mainly focussed on the motions perpendicular to the Galactic disc, μ b . Here we show that the μ l map of the RGB sample in the bottom panel of Fig. 9 compares quite well to our red-clump sample shown in the top panel. Since here we are more interested in the possible kinematic effects of the Galactic bar, we can now study the bulk motions in the Galactic plane out to larger distances from the Sun, using a cleaner sample of RC stars. We highlight several dynamical features present in this sample.

The prominent symmetric arc features around the solar position towards the outer and inner disc are produced by the Galactic rotation curve, and follow the overall expected trends (see e.g. Fig. 3 in Brunetti & Pfenniger 2010 for a prediction of the μ l map for an axisymmetric disc). It is interesting to see that the proper motion contours in the inner disc coincide with the angle of the Galactic bar (defined by stellar density). Qualitatively this coherent motion seen in the region of the bar agrees with earlier predictions by Brunetti & Pfenniger (2010, their Fig. 8) and the disc red-clump test particle simulations of Romero-Gómez et al. (2015).

A more quantitative comparison to kinematic Galactic models including effects of the Galactic bar is left to future studies.

4.6. Extinction maps

Figures 10 and 11 show StarHorse -derived two-dimensional (2D) median extinction maps. Figure 10 shows the all-sky A V map in Aitoff projection. The overall appearance of this figure compares very well to the expected 2D extinction map (e.g. Andrae et al. 2018, Fig. 21; Lallement et al. 2018, Fig. 6).

In Fig. 11, we show median extinction maps in four distance bins between 300 and 1500 pc, for the Orion region. The four panels show how with increasing distance extinction from molecular clouds gradually fills the Galactic plane. In principle, our results can thus be used to construct 3D extinction maps (e.g. Schlafly & Finkbeiner 2011; Green et al. 2015) and infer the three-dimensional dust distribution in the extended solar vicinity (e.g. Capitanio et al. 2017; Rezaei Kh. et al. 2018; Lallement et al. 2019; Zucker et al. 2019).
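Median extinction maps in distance slices can be built with a few lines of NumPy. The following sketch is purely illustrative: the input arrays, the pixel edges, and the function name are hypothetical, and the choice of distance limits is left to the caller.

```python
import numpy as np

def median_av_map(l_deg, b_deg, dist_kpc, av, d_lo, d_hi, l_edges, b_edges):
    """Median A_V per (l, b) pixel for stars with d_lo <= distance < d_hi.
    Pixels without stars are set to NaN."""
    sel = (dist_kpc >= d_lo) & (dist_kpc < d_hi)
    il = np.digitize(l_deg[sel], l_edges) - 1   # pixel index along l
    ib = np.digitize(b_deg[sel], b_edges) - 1   # pixel index along b
    av_sel = av[sel]
    grid = np.full((len(l_edges) - 1, len(b_edges) - 1), np.nan)
    for i in range(grid.shape[0]):
        for j in range(grid.shape[1]):
            in_pix = (il == i) & (ib == j)
            if in_pix.any():
                grid[i, j] = np.median(av_sel[in_pix])
    return grid

# Toy data: four stars in two pixels, extinction growing with distance.
l = np.array([200.5, 200.5, 200.5, 201.5])
b = np.array([-19.5, -19.5, -19.5, -19.5])
d = np.array([0.35, 0.40, 1.00, 0.35])
a_v = np.array([0.1, 0.3, 2.0, 0.5])
l_edges = np.array([200.0, 201.0, 202.0])
b_edges = np.array([-20.0, -19.0])

near = median_av_map(l, b, d, a_v, 0.3, 0.6, l_edges, b_edges)
far = median_av_map(l, b, d, a_v, 0.9, 1.2, l_edges, b_edges)
print(near, far)
```

Since the extinction towards a star is cumulative along the line of sight, differences between successive slices, rather than the slices themselves, trace the dust located in a given distance interval.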

5. Precision and accuracy

5.1. Overall precision

Figure 12 shows the median relative uncertainties in the StarHorse output parameters as a function of Gaia DR2 G magnitudes. In all panels, we again show the results for all converged stars (in black, as before), and for the flag-cleaned sample (in red, as before). The other coloured lines shown in Fig. 12 demonstrate the precision improvement obtained by adding more photometric data to the Gaia DR2 data. The blue curves denote the running median uncertainty for stars with only Gaia DR2 photometry, while the other coloured lines refer to stars for which other data are available (as indicated in the legend in the middle panels). The green curve stands for the stars with complete photometric information ( SH_PHOTOFLAG ==“GBPRPgrizyJHKsW1W2”).

Uncertainties in most quantities increase with G, as expected, due to the increasing uncertainties in the astrometry and photometry (we note the logarithmic y-axis in all panels of Fig. 12, except the top middle). The fundamental determinant for the distance precision (as well as for most of the other StarHorse output parameters) is of course not the magnitude itself, but the parallax signal-to-noise ratio (e.g. Bailer-Jones et al. 2018, see also Fig. 1). The complex correlations between the output parameters and their uncertainties are shown in Fig. 3 for the primary output parameters. For a global picture of the parameter and uncertainty trends including the secondary output parameters, we refer to Fig. B.1. For the sake of brevity, however, here we focus our discussion mainly on the median uncertainty trends with G magnitude (which is correlated with ϖ/σ ϖ ) shown in Fig. 12.

The distance precision plot (top left panel of Fig. 12) deserves some further discussion. To begin with, in the bright regime (G DR2 < 14, including the radial-velocity sub-sample of 7 × 10^6 stars), the vast majority of stars have uncertainties of 8% or less in distance, as expected from the exquisite parallax quality of Gaia DR2 (Lindegren et al. 2018; Arenou et al. 2018). Focussing on the orange and blue lines of this plot, it may be surprising that the addition of 2MASS magnitudes to the input data seems to worsen the distance precision. In fact, the most precise distances for stars with G < 12.5 are obtained when only using Gaia DR2 data. This observation points to a tension between the 2MASS and the Gaia DR2 data. The range of acceptable distances for these stars is precisely determined by their measured parallax (we assume the parallax offset to be fixed and only a function of magnitude; see Table 1), so that the three Gaia DR2 passbands alone already constrain the space of possible stellar parameters and extinctions. The three 2MASS magnitudes alone also constrain effective temperature and extinction, so if these two independent constraints are in tension with each other (most likely due to an underestimated, systematic parallax uncertainty), the uncertainty on the output distance increases.

For a similar, but not identical reason, the addition of Pan-STARRS1 magnitudes to the set of Gaia DR2+2MASS+AllWISE photometry does not improve the distance precision, but has a slight effect in the opposite direction (compare magenta and green lines in Fig. 12). We suggest that this points to an inconsistency between the Gaia DR2 photometry and the Pan-STARRS1 photometry. Since the Gaia DR2 photometry is of unprecedented precision, and the transmission curves and zeropoints are well-characterised (at least for not too red stars, G BP − G RP ≲ 1.5) by Maíz Apellániz & Weiler (2018), we tentatively suggest that this indicates a need for additional corrections of the PS1 zeropoints. However, we decided to keep the PS1 photometry as input where possible, since the five optical passbands considerably help in increasing the precision of extinction and metallicity (see top middle and bottom middle panel of Fig. 12).

The wiggle in the median uncertainty at G ∼ 13 is due to the decrease in parallax uncertainty at that magnitude transition (Lindegren et al. 2018). The sharp increase in median distance uncertainty at G ≃ 16.5 is due to the transition into the low-signal-to-noise parallax regime. In particular, the distance uncertainties are much larger for faint main-sequence stars, which fill the locus of G DR2 > 16.5 and σ d /d > 0.5, whereas the (predominantly photometric) distances to distant red giants remain more precise (see Fig. 1).

The flag-cleaned sample, by construction, yields much more precise results, also in the faint regime. The drop in median uncertainty for those stars is due to the distance precision cut embedded in the definition of the SH_OUTFLAG (see Appendix A).

For the precision in A V extinction (top middle panel of Fig. 12), we note a flat trend as a function of G, with the uncertainty increasing significantly only in the regime where parallaxes and distances become much more uncertain (G ≃ 16.5). We also note the expected increase in precision when including more photometric passbands (see also Table 2).

Similar observations hold for the median uncertainties in effective temperature as well as the uncertainties in the secondary output parameters log g, [M/H], and stellar mass M * . For the latter we also note an (at first sight puzzling) decreasing trend of the overall median uncertainty (black line in bottom right panel of Fig. 12) up to G ∼ 14, which is an effect of the different sampling of stellar populations at different magnitudes.

Figure 13 shows the median uncertainties in the primary output parameters d, A V , and T eff for stars in the Galactic disc as a function of their position. The top row shows the precision in each pixel in the X vs. Y plane for all converged stars, while the bottom row shows the corresponding results for the flag-cleaned sample.

The top left panel demonstrates the sharp transition into the low-signal-to-noise parallax regime at heliocentric distances of ∼2.5 kpc (the Gaia DR2 “parallax sphere”). In the bottom left panel, this effect is much less severe, because of many distant giant stars passing the quality criteria of the StarHorse flags. Even in the Galactic bar region, the typical uncertainties for the flag-cleaned sample only amount to ∼30%.

The middle panels of Fig. 13 especially highlight the decrease in A V precision in the quarter of the sky for which no Pan-STARRS1 photometry is available. We also note that outside the solar vicinity our extinction estimates are more precise in regions dominated by giant stars, resulting in a ring around the Sun for which the uncertainties are higher. The same is true for the effective temperatures (right panels), because the two posterior quantities are correlated (see Appendix B for a short discussion of the correlations in the estimated parameters).

5.2. Accuracy: Comparison to asteroseismology

It is difficult to find true benchmark tests for the distance, extinction, and stellar parameter scales of large surveys that are not themselves affected by significant systematic uncertainties. One of the most precise and widely used anchors in the context of spectroscopic surveys is the asteroseismic surface gravity scale defined by the seismic scaling relations for red giant stars (e.g. Holtzman et al. 2015; Valentini et al. 2016, 2017).

In Fig. 14, we show a comparison to the precise surface gravities, distances, and extinctions determined from asteroseismic data from the Kepler and K2 missions (Khan et al. 2019). The surface gravities have been computed using the ν max scaling relation (Brown et al. 1991; Kjeldsen & Bedding 1995) and can be considered accurate to 0.05 dex (e.g. Noels et al. 2016; Hekker & Christensen-Dalsgaard 2017). Distances and extinctions were derived with the Bayesian stellar parameter estimation code PARAM (da Silva et al. 2006; Rodrigues et al. 2014, 2017), using the global seismic oscillation parameters Δν and ν max as well as effective temperatures and metallicities determined from APOGEE DR14 (Abolfathi et al. 2018) and SkyMapper (Casagrande et al. 2019) for the Kepler and K2 fields, respectively.
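The ν max scaling relation for the surface gravity, g ∝ ν max T eff^0.5, can be evaluated in a few lines. The solar reference values below (ν max,⊙ = 3090 μHz, T eff,⊙ = 5777 K, log g ⊙ = 4.438) are commonly used numbers adopted here as assumptions; they are not necessarily those used by Khan et al. (2019).

```python
import math

# Assumed solar reference values (not taken from the paper).
NU_MAX_SUN = 3090.0   # muHz
TEFF_SUN = 5777.0     # K
LOGG_SUN = 4.438      # dex (cgs)

def seismic_logg(nu_max_muhz, teff_k):
    """Surface gravity from the nu_max scaling relation:
    g / g_sun = (nu_max / nu_max_sun) * (Teff / Teff_sun)**0.5."""
    return (LOGG_SUN
            + math.log10(nu_max_muhz / NU_MAX_SUN)
            + 0.5 * math.log10(teff_k / TEFF_SUN))

logg_sun = seismic_logg(NU_MAX_SUN, TEFF_SUN)   # recovers the solar value
logg_rc = seismic_logg(30.0, 4800.0)            # typical red-clump numbers
print(logg_sun, logg_rc)
```

Because T eff enters only through a square root inside the logarithm, a 200 K temperature error at 4800 K shifts log g by less than 0.01 dex, which helps explain why the seismic gravity scale is regarded as a robust anchor.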

The log g comparison shown in the top left panel of Fig. 14 shows that our posterior gravity values perform unexpectedly well, with median biases below the 0.1 dex level. Since the log g information is mostly driven by the parallax measurements, the fact that our posterior log g values agree so well with the values obtained by using Kepler and K2 data underlines the unprecedented quality of the Gaia DR2 parallaxes. It also means that our global zero-point correction inspired by Zinn et al. (2019) performs very well: slightly better in the Kepler field, as expected, but still acceptably in the K2 fields, although the parallax zero-point in these fields is different (∼ − 0.006 mas in C3 and ∼ − 0.017 mas for C6, compared to ∼ − 0.05 mas for the Kepler field; Khan et al. 2019). Another encouraging fact is that for all three fields we get the lowest biases for stars around the red clump (2.3 ≲ log g ≲ 2.7). Although there are few comparison stars in the upper RGB for the C3 field, it seems that these tend to have slightly more biased posterior log g values, in concordance with the different parallax zero-point for that field. In summary, the comparison to the asteroseismically determined surface gravities shows that our posterior log g estimates perform better than expected, with biases and precisions at the level of medium-to-high-resolution spectroscopy (at least for luminous red-giant stars out to ∼5 kpc).

Regarding the primary output parameters distance and extinction, the comparison with Khan et al. (2019) shows that our distances to red-giant stars seem to be accurate at the 10% level with respect to the asteroseismic scale at least up to distances of around 5 kpc, with the most accurate results achieved for the Kepler field (biases < 1% up to d ≲ 3.8 kpc; see middle panel of Fig. 14), for which our parallax zeropoint correction is most accurate. For C3 and C6, the parallax correction most likely overestimates the true parallaxes, which is why the StarHorse distances are systematically lower than those from PARAM over the entire range of distances. As for the extinction comparison (right panel and bottom row of Fig. 14), the picture is similar, with some systematics seen for the most nearby and the most distant stars in the K2 fields, further corroborating the position-dependent parallax zero-point shift reported by Arenou et al. (2018) and Khan et al. (2019).
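The sign and rough size of this effect follow from simple arithmetic. The sketch below uses the naive inverse-parallax distance only to illustrate the direction of the bias; it assumes a hypothetical applied correction of +0.05 mas (matching the Kepler-field offset quoted above) and is not the StarHorse posterior computation.

```python
def naive_distance_bias(d_true_kpc, true_offset_mas, applied_correction_mas):
    """Relative bias of the naive 1/parallax distance when the applied
    zero-point correction differs from the true offset of the field."""
    plx_true = 1.0 / d_true_kpc                   # mas, for d in kpc
    plx_obs = plx_true + true_offset_mas          # offsets are negative
    plx_corr = plx_obs + applied_correction_mas
    return (1.0 / plx_corr) / d_true_kpc - 1.0

APPLIED = 0.05  # mas, assumed global correction (illustrative)

bias_kepler = naive_distance_bias(3.0, -0.05, APPLIED)
bias_c3 = naive_distance_bias(3.0, -0.006, APPLIED)
bias_c6 = naive_distance_bias(3.0, -0.017, APPLIED)
print(bias_kepler, bias_c3, bias_c6)
```

For a star at 3 kpc, the over-corrected parallaxes in C3 and C6 shorten the naive distances by about 12% and 9%, respectively, while the Kepler-field value is unbiased by construction in this toy calculation.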

5.3. Accuracy: Comparison to APOGEE

To further test the accuracy of the StarHorse Gaia DR2 results, we cross-matched them with the stars contained in the fourteenth data release of the Sloan Digital Sky Survey (SDSS DR14; Abolfathi et al. 2018), resulting in 210 545 stars, out of which 179 272 pass all Gaia DR2 StarHorse flags. Furthermore, for this comparison we only consider stars with valid calibrated APOGEE stellar parameters, resulting in a total overlap sample of 59 351 giant stars. In Fig. 15, we show the differences with respect to the spectroscopic stellar parameters derived by the APOGEE Collaboration (which are of course of much higher precision), as well as StarHorse distances and extinctions derived from those parameters together with Gaia DR2 parallaxes and additional photometry (same version of the code; Santiago et al., in prep.). We note that no Gaia DR2 photometry was used for the APOGEE StarHorse run.

In the top row of Fig. 15, we show the differences between our photometric estimates and the values derived by the APOGEE Stellar Parameter and Chemical Abundances Pipeline (ASPCAP; García Pérez et al. 2016) as a function of the ASPCAP values, for effective temperature, surface gravity, and metallicity, respectively. Since the APOGEE stellar parameters are completely independent from ours, these comparisons possibly reveal the most important systematics of the results presented in this work. It is worth noting, however, that even the APOGEE sample cannot be considered a gold standard, since the photometric and spectroscopic effective temperature scales depend on the wavelength range and resolution used, and may still be subject to shifts of up to 100 K (e.g. Casagrande et al. 2014; Jönsson et al. 2018). In fact, to remove systematics with respect to the temperature scale defined by the infra-red flux method (González Hernández & Bonifacio 2009), for DR14 and following releases a metallicity- and temperature-dependent calibration was applied to the raw ASPCAP results (Holtzman et al. 2018; Jönsson et al. 2018).

The effective temperature comparison shown in the top left panel of Fig. 15 shows very few systematics for the overlap sample inside the ASPCAP calibration range. Our effective temperature scale (defined by the PARSEC 1.2S isochrones) is offset by −65 K on average (median: −46 K) with respect to the APOGEE sample, with the difference being zero around 4500 K and increasing for both cooler and warmer stars. Since this difference is at the level of the systematics expected for the APOGEE T eff scale, we can consider it insignificant. Perhaps more interesting is the overall spread of the temperature difference, which amounts to 197 K and is very similar to the formal uncertainties that StarHorse delivers for T eff .

The second and third panels in the top row of Fig. 15 show an analogous comparison for our secondary output parameters log g and [M/H]. As for T eff , spectroscopic surface gravity values also suffer from some level of systematics (Holtzman et al. 2015; Valentini et al. 2016). In DR14 and subsequent releases, however, the raw ASPCAP values have been carefully calibrated using precise log g values delivered by the CoRoT and Kepler asteroseismic missions (see Holtzman et al. 2018 for details). The systematics seen in the log g comparison are therefore likely to be mainly due to our analysis (i.e. our priors), or intrinsic differences of the gravity scale of the PARSEC models with respect to the asteroseismic scale.

For the comparison to the (calibrated) ASPCAP metallicity scale, which is both much more precise and accurate than ours (∼0.05 dex; Holtzman et al. 2018), we see that StarHorse tends to determine solar metallicities for the bulk of the APOGEE stars. This behaviour shows that the metallicity sensitivity of the broad-band photometric filters used in this work is very small for moderate metallicities ([M/H] ≳ −1), and the posterior metallicity estimates are in most cases dominated by the (broad) metallicity priors. However, for moderately metal-poor objects (−2 ≲ [M/H] ≲ −1), our code seems to deliver somewhat more reliable metallicity estimates, enabling the construction of a candidate list of metal-poor stars, which may be followed up with spectroscopy (Chiappini et al., in prep.).

The middle row of Fig. 15 displays the comparison with the distances to APOGEE stars obtained with the same version of the StarHorse code, but including also the spectroscopic stellar parameters as input quantities, thereby yielding much more precise results (Queiroz et al. 2018). The left panel shows that we achieve remarkable overall concordance with the astro-spectro-photometric distance scale up to distances of ∼7 kpc, with our photo-astrometric distances being increasingly too small towards more distant (especially extragalactic) stars, as could be expected (we note that this trend is comparable to the trend observed by comparing to open-cluster distances discussed in Sect. 5.4, and contrary to the trend observed when comparing to the distances of Bailer-Jones et al. 2018; see Sect. 6.1). The right panel also shows that the distance differences do not show any dependence on sky position, except for the very extincted regions close to the Galactic plane, where we see a tendency to overestimate distances compared to APOGEE (mostly a consequence of our A V = 4 boundary). Overall, however, since also the APOGEE-derived distances are model- and prior-dependent, the meaning of the median trend is limited and can be considered rather an internal consistency check.

The same is true for the extinction comparison (extinction difference as a function of sky position) shown in the bottom row of Fig. 15, although here we see more significant systematic trends. The bottom left panel shows that the median systematic differences as a function of distance are moderate (≲0.2 mag), although certainly significant. In the bottom right panel, however, we observe a slight (≲0.1 mag) overestimation of the median extinction in the low-extinction regime at high latitudes with respect to the APOGEE-derived values, while at low latitudes the Gaia DR2+photometric extinction estimates are on average lower than the APOGEE ones by ∼0.2 mag for most parts of the Galactic disc, and severely underestimated in the most extincted regions, thus compensating the larger distances observed in the same sky regions (again mostly due to our A V prior, which is too restrictive for very extincted regions). These caveats should be kept in mind when using our catalogue.

5.4. Accuracy: Open cluster comparison

By virtue of the precise Gaia DR2 astrometry, Cantat-Gaudin et al. (2018a) were able to establish new membership probabilities and physical parameters for 1229 Galactic open clusters. In Fig. 16, we compare our results obtained for the most certain cluster members of Cantat-Gaudin et al. (2018a) to the cluster distances reported in that work. These cluster distances were obtained by a maximum-likelihood analysis taking into account the systematic parallax uncertainties and the global parallax zero-point offset of −0.029 mas (Lindegren et al. 2018).

Overall, Fig. 16 shows similar trends as the comparison to the APOGEE-derived distances (left panel in the middle row of Fig. 15). We see slight negative median differences for distances between ∼2 and 10 kpc, which, considering the systematic parallax uncertainties, is very much beyond the accuracy limits of the open cluster distance scale (we note the different global zeropoint applied in the bright regime). The obvious advantage of the cluster distances is the suppression of the statistical uncertainty with the square root of the number of members. However, the cluster members are affected by the same varying parallax zero-point as a function of sky position, magnitude, parallax, and/or colour. Therefore, this comparison is also not a fundamental distance comparison for the most distant clusters, since their distance uncertainty is dominated by systematics. In addition, the P memb ==1 criterion used in Fig. 16 only refers to astrometric membership; no photometry was used in the construction of the membership list of Cantat-Gaudin et al. (2018a).
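The interplay between the shrinking statistical error and the systematic floor can be made explicit with an inverse-variance weighted mean parallax. This is a simplified stand-in for the maximum-likelihood analysis of Cantat-Gaudin et al. (2018a), not a reproduction of it; the 0.1 mas systematic floor and the function name are assumptions chosen for illustration.

```python
import numpy as np

def cluster_parallax(plx_mas, err_mas, zero_point_mas=-0.029, sys_floor_mas=0.1):
    """Weighted mean cluster parallax with statistical and total errors.
    The zero point is subtracted from every member parallax; the
    systematic floor (an assumed value) is added in quadrature."""
    plx = np.asarray(plx_mas, dtype=float) - zero_point_mas
    w = 1.0 / np.asarray(err_mas, dtype=float) ** 2
    mean = np.sum(w * plx) / np.sum(w)
    stat_err = 1.0 / np.sqrt(np.sum(w))       # shrinks as 1 / sqrt(N)
    total_err = np.hypot(stat_err, sys_floor_mas)
    return mean, stat_err, total_err

# Toy cluster: 100 members at 0.2 mas observed parallax, 0.1 mas errors.
mean, stat_err, total_err = cluster_parallax(np.full(100, 0.2), np.full(100, 0.1))
print(mean, stat_err, total_err)
```

With 100 members the statistical error drops to 0.01 mas, yet the total error remains close to the assumed floor, so for a cluster parallax of about 0.2 mas the distance is uncertain at the tens-of-percent level regardless of the number of members.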

To further illustrate the performance of StarHorse for the open cluster sample, we show in Fig. 17 a detailed comparison for the four most distant open clusters (Melotte 71, NGC 2420, NGC 6819, and NGC 6791) studied at high spectral resolution in the compilation of Bossini et al. (2019). These authors have recently published revised Bayesian cluster parameters based on Gaia DR2 data and the membership list of Cantat-Gaudin et al. (2018a), thus providing also cluster ages and line-of-sight extinctions. In the left column of Fig. 17, we show the distance-extinction plane for each of the clusters, indicating also the size of the StarHorse uncertainties of the individual members. Except for a number of outliers in NGC 6819, and for the problematic cluster NGC 6791 (which is also a highly debated object in the open-cluster literature; see e.g. Linden et al. 2017; Villanova et al. 2018; Martinez-Medina et al. 2018), we find that our results for the bulk of individual member stars cluster very well around the median distances and extinctions of Bossini et al. (2019), within the known systematics.

The middle column of Fig. 17 shows the effect of applying a StarHorse extinction and distance correction to the cluster colour-magnitude diagram, displaying the amount of noise added to the CMD when using our results (which, we recall, were obtained under the assumption that these stars are field stars). We find that the resulting diagrams are not much noisier than the original cluster sequences, even for these distant populations, giving further confidence in our results.
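A distance and extinction correction of a CMD amounts to subtracting the distance modulus and the per-band extinctions. The sketch below assumes that extinctions in the G, G BP, and G RP bands are available for each star; the argument names are hypothetical and do not refer to specific catalogue columns.

```python
import numpy as np

def dereddened_cmd(g, bp, rp, dist_kpc, a_g, a_bp, a_rp):
    """Intrinsic colour (BP - RP)_0 and absolute magnitude M_G from
    apparent magnitudes, a distance in kpc, and per-band extinctions."""
    dist_modulus = 5.0 * np.log10(dist_kpc * 1000.0 / 10.0)  # d in units of 10 pc
    abs_g = g - dist_modulus - a_g
    colour0 = (bp - a_bp) - (rp - a_rp)
    return colour0, abs_g

# Toy star at 1 kpc with moderate extinction.
colour0, abs_g = dereddened_cmd(
    g=15.0, bp=15.6, rp=14.4, dist_kpc=1.0, a_g=0.5, a_bp=0.6, a_rp=0.35
)
print(colour0, abs_g)
```

In this form, distance errors move a star vertically in the corrected CMD, while extinction errors move it along the reddening vector, which is why the scatter added to a cluster sequence is a compact diagnostic of both quantities.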

Finally, in the right column of Fig. 17 we show the posterior Kiel diagrams for each of the four clusters, colour-coded by the median metallicity in each pixel, demonstrating that there is at least some metallicity sensitivity in the photometric data used in this work.

5.5. Caveats

As can be expected from a data-intensive endeavour such as the one undertaken in this paper, there are several known caveats that should be taken into account when using our results. Some of them were discussed in the previous sections, but we list some additional considerations here. Specifically, for this work we did not attempt to correct the following effects (ordered by decreasing relative importance) that may have potential impacts on further scientific analyses.

Colour- and sky-position-dependent parallax zero-point shifts: As has been demonstrated by Arenou et al. (2018), Lindegren (2018), and Khan et al. (2019), the parallax zero-point offset of Gaia DR2 depends on the magnitude, colour, and position in the sky, in a non-trivial manner. In this work we only account for a magnitude-dependent zero-point offset, which may lead to biased results in parts of the sky where the parallax zero-point shift is very different from the global shift applied here.

Unresolved binaries: For most binaries the primary star by far dominates the light budget (especially on the red-giant branch), so that we expect that our results do not suffer from significant binarity-induced biases in that regime. As for the main sequence, the exquisite quality of the Gaia DR2 photometry has shown that many star clusters show a well-populated equal-mass binary sequence. For these cases, we do expect significantly biased parameters. However, currently the only computationally tractable way to account for equal-mass binaries in the data is to use data-driven models for the colour-magnitude diagram (Coronado et al. 2018), which was explicitly not the aim of this work.

Simple Galactic priors: Due to optimization of computational resources, our priors do not include some well-established features of the Galaxy: for example a warped stellar disc, extended structures in the outer disc such as the Monoceros ring or Triangulum-Andromeda, or the presence of the nearby Magellanic Clouds, the Sagittarius dSph, etc. (see Sect. 4.4). Also, the current extinction limit of A V = 4 could be replaced by a more informative prior.

Uncertainties in the extinction curve: Our results rely on the validity of the assumed extinction curve, which is limited in several respects. Most importantly, we do not allow for variable R V values (or x, in the Schlafly et al. 2016 notation). In addition, by using the G BP magnitudes we extrapolate the Schlafly et al. (2016) extinction law slightly into the blue. In the near future, with the Gaia DR3 BP/RP low-resolution spectra, it may be possible to simultaneously solve for stellar parameters, distance, extinction, and the extinction curve.

Gaia DR2 photometry in crowded fields: The Gaia DR2 aperture photometry is known to be prone to systematic errors in crowded regions of the Galactic disc (Evans et al. 2018; Arenou et al. 2018). For many applications, it will be sufficient to filter out data affected by this problem using the phot_bp_rp_excess_factor (as implemented in SH_GAIAFLAG[1]).

Systematic photometric errors in G BP in the faint regime: Faint sources (G BP ≳ 19; 2% of the converged stars) have been shown to be affected by background under-estimation, which leads to magnitude errors greater than 0.02 mag (Arenou et al. 2018, Fig. 34b). This may imply slight biases in our derived parameters for these stars.

Systematic photometric errors in supplementary photometry: Systematic photometric errors will result in slightly biased results, especially for the more delicate secondary output parameters. This is the reason why we introduced an uncertainty floor for the photometric data. As an example, in Sect. 5.1 we saw that the inclusion of the Pan-STARRS1 photometry in the input data yields more precise extinction and metallicity estimates, but slightly worsens the distance and log g precision. This fact suggests some remaining tensions between the zero-points or the passband definitions of the Gaia DR2 and Pan-STARRS1 photometric systems. Another potential problem arises for missing photometric uncertainties in 2MASS and WISE (0.03% of the converged sample): in these cases, the catalogue magnitude values refer to upper limits and our fiducial uncertainties of 0.3 mag may be too optimistic.

Uncertainties in the Gaia DR2 passband definitions: Although we have used the improved transmission curves and recalibrated photometry of Maíz Apellániz & Weiler (2018), remaining uncertainties in the passband definition may impact our results. Maíz Apellániz & Weiler (2018) have convincingly shown that more flux-calibrated spectro-photometry is necessary to characterise the on-board transmission curves of the Gaia photometers, especially in the red regime.

Contamination by extragalactic objects and potentially erroneous cross-matches: In this work we have used the carefully computed crossmatch tables provided as part of Gaia DR2 (Marrese et al. 2019), so that the occurrence of erroneous crossmatches should be minimal. Also, the contamination of our catalogue by galaxies with observed colours similar to those of stars is possible, although very unlikely in our magnitude regime.

6. Comparison to Gaia DR2 results

Figures 18 through 23 present a comparison of the StarHorse primary output parameters with the widely used Gaia DR2-based distance catalogue of Bailer-Jones et al. (2018), and with the Gaia DR2 astrophysical parameters presented by Andrae et al. (2018). In this section, we discuss these comparisons in detail.

Shortly after Gaia DR2, Bailer-Jones et al. (2018) released a catalogue of geometric Bayesian distances for 1.33 billion stars inferred from the Gaia DR2 parallaxes. Their goal was to provide homogeneous distance estimates for the entirety of Gaia DR2 stars, “independent of assumptions about the physical properties of, or interstellar extinction towards, individual stars”. The authors rely on a solid theoretical background (Bailer-Jones 2015; Astraatmadja & Bailer-Jones 2016b), and carefully calibrated their geometric distance prior as a function of Galactic longitude and latitude using a Gaia DR2-like stellar density model (Rybizki et al. 2018). Their catalogue has been shown to provide precise distances also beyond the Gaia DR2 parallax sphere.

The main advantages of the approach taken by Bailer-Jones et al. (2018) are 1. a very clean selection function (they provide mode statistics for virtually all 1.33 billion stars with measured parallaxes), and 2. a smooth transition between the likelihood-dominated and the prior-dominated regime of the Gaia DR2 data. According to the authors, the main drawbacks are 1. lower precision than could be achieved by including more information, and 2. biased distances for certain subsets of objects (e.g. distant giants, extragalactic stars). With the StarHorse results, we can now quantify these statements.

Figure 18 shows a comparison of StarHorse and Bailer-Jones et al. (2018) distances. The top left panel shows the median relative distance difference as a function of StarHorse distance. For the flag-cleaned sample, we observe an overall concordance between both distance scales up to distances of ∼3 kpc, and then a continuously growing deviation, in the sense that the Bailer-Jones et al. (2018) distances are typically smaller than the StarHorse ones. This behaviour is expected, since the exponentially decaying space density prior employed by Bailer-Jones et al. (2018) tends to confine stars to distances within ∼6 kpc.
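To illustrate why an exponentially decaying space density prior pulls poorly constrained stars towards a characteristic distance, the following minimal sketch evaluates the distance posterior on a grid. It assumes the prior form P(r) ∝ r² exp(−r/L) together with a Gaussian parallax likelihood; the function name, the grid limits, and the length scale L = 1.5 kpc are illustrative assumptions and not values taken from Bailer-Jones et al. (2018).

```python
import numpy as np

def edsd_posterior_mode(parallax_mas, sigma_mas, length_kpc,
                        r_max_kpc=30.0, n_grid=30000):
    """Mode of the distance posterior (kpc) for the prior r^2 exp(-r/L)
    combined with a Gaussian likelihood in parallax (hypothetical helper)."""
    r = np.linspace(1e-3, r_max_kpc, n_grid)        # distance grid in kpc
    log_prior = 2.0 * np.log(r) - r / length_kpc    # exponentially decaying density
    # A parallax in mas corresponds to 1 / (distance in kpc).
    log_like = -0.5 * ((parallax_mas - 1.0 / r) / sigma_mas) ** 2
    return r[np.argmax(log_prior + log_like)]

# Precise parallax: the likelihood dominates and the mode stays near 1/parallax.
precise = edsd_posterior_mode(1.0, 0.02, length_kpc=1.5)
# Very noisy parallax: the prior dominates and the mode moves towards 2 L.
noisy = edsd_posterior_mode(0.1, 1.0, length_kpc=1.5)
```

In the prior-dominated case the mode approaches 2L regardless of the true distance, which is the kind of behaviour that leads to underestimated distances for faint, distant giants.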

In the top right panel of Fig. 18, we compare the median precision obtained with StarHorse as a function of G magnitude with the formal uncertainties given by Bailer-Jones et al. (2018). Surprisingly at first sight, the median Bailer-Jones et al. (2018) distance uncertainties are smaller than the corresponding StarHorse uncertainties. However, it should be kept in mind that we increased the Gaia DR2 parallax uncertainties, in accordance with the recent analysis of Lindegren (2018). In the same panel of Fig. 18 we also show the (ill-defined) approximations of the uncertainties of inverse parallax distances, with and without our parallax uncertainty recalibration (see Table 1). The offset between these two lines is essentially the same as the one between the Bailer-Jones et al. (2018) distances and the StarHorse results, indicating that the parallax uncertainty is the driving parameter also for our distance precision, and suggesting that the Bailer-Jones et al. (2018) distance uncertainties are slightly underestimated.
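The scaling behind this argument can be made explicit with first-order error propagation for the naive estimator d = 1/ϖ. This is only a sketch of the approximation, which breaks down for large relative parallax uncertainties; the factor k is a generic, purely multiplicative inflation factor introduced here for illustration and is not the recalibration of Table 1.

```latex
d = \frac{1}{\varpi}, \qquad
\sigma_d \approx \left|\frac{\partial d}{\partial \varpi}\right| \sigma_\varpi
         = \frac{\sigma_\varpi}{\varpi^{2}}
\quad\Longrightarrow\quad
\frac{\sigma_d}{d} \approx \frac{\sigma_\varpi}{\varpi},
\qquad
\sigma_\varpi \rightarrow k\,\sigma_\varpi
\;\Longrightarrow\;
\frac{\sigma_d}{d} \rightarrow k\,\frac{\sigma_\varpi}{\varpi}.
```

Under this approximation, inflating the parallax uncertainties by a factor k raises the relative distance uncertainty by the same factor, which appears as a constant offset on a logarithmic axis.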

The bottom panels of Fig. 18 show the median relative distance deviation as a function of sky position. In the bottom left panel, we show the “all converged stars” sample, while the bottom right one only contains the results with SH_OUTFLAG ==“00000”. The concordance of our flag-cleaned results with the distances of Bailer-Jones et al. (2018) is remarkable over most of the sky (< 5% differences in more than 90% of the HealPix cells), excluding only the Inner Galaxy (|l|≲30°, |b|≲10°) and the Magellanic Clouds. The different picture for the full sample again cautions against the blind use of our non-flag-cleaned median distances.
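A statistic of this kind can be computed along the following lines. The sketch assumes that every star already carries an integer sky-cell index (for example a HEALPix pixel number obtained elsewhere); the function names and the toy values are hypothetical and do not reproduce the numbers quoted above.

```python
import numpy as np

def median_rel_diff_per_cell(cell_id, d_ref, d_other):
    """Median of (d_other - d_ref) / d_ref for every sky cell that contains stars."""
    cell_id = np.asarray(cell_id)
    d_ref = np.asarray(d_ref, dtype=float)
    d_other = np.asarray(d_other, dtype=float)
    rel = (d_other - d_ref) / d_ref
    cells = np.unique(cell_id)
    # Simple loop over cells; adequate for a sketch, slow for very many cells.
    medians = np.array([np.median(rel[cell_id == c]) for c in cells])
    return cells, medians

def fraction_of_concordant_cells(medians, tolerance=0.05):
    """Fraction of cells whose median relative difference is below the tolerance."""
    return float(np.mean(np.abs(medians) < tolerance))

# Toy data: three cells, reference distances and comparison distances in kpc.
cell_id = [0, 0, 0, 1, 1, 2]
d_sh = [1.0, 2.0, 4.0, 1.0, 2.0, 5.0]
d_bj = [1.02, 1.98, 4.0, 0.8, 1.6, 5.1]
cells, medians = median_rel_diff_per_cell(cell_id, d_sh, d_bj)
frac = fraction_of_concordant_cells(medians)
```

In this toy case two of the three cells agree to better than 5%, while the middle cell shows a 20% deficit in the comparison distances.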

The remaining differences are most probably due to the fact that the prior used in Bailer-Jones et al. (2018) is more apt for nearby main-sequence stars and therefore tends to underestimate distances to far-away giant stars, especially in the Galactic bulge (see Bailer-Jones et al. 2018). An additional effect is that close to the inner Galactic plane the stellar density does not decrease exponentially, but in fact is a more complex function than described by the exponentially decreasing space density prior. It is also possible that the dust model used by Rybizki et al. (2018) differs substantially from the actual dust distribution in the Inner Galaxy, so that the values of the prior length scale in this region could be underestimated.

Finally, in Fig. 19 we compare the Galactic density maps in Galactic Cartesian co-ordinates. In the region of highest density close to the Gaia DR2 parallax sphere, the maps are very similar, as expected. The StarHorse distances reach higher values due to the less restrictive density prior used. The most striking feature, however, is the emergence of the Galactic bar as a clear overdensity in the XY plane (see Sect. 4.4).

Summarising this comparison, we can say that due to the relatively low impact of the additional photometric measurements on the distances, our distance estimates are not more precise than the distances obtained by Bailer-Jones et al. (2018), even when rescaling their input parallax uncertainties. However, we argue that at large distances, our values are more accurate due to the choice of more informative Galactic priors, which allows us to see more substructure, including the direct imprint of the Galactic bar in the density maps.

6.2. Comparison to Gaia DR2 Apsis results

As part of Gaia DR2, Andrae et al. (2018) published a catalogue of astrophysical parameters (T eff , A G , E(G BP − G RP ), radius, luminosity). Due to the limited number of observables used (parallax + Gaia DR2 photometry), the output parameters T eff and A G are strongly correlated, and therefore suffer from a number of caveats documented in Gaia DR2 (see Andrae et al. 2018 for an extensive discussion). In this work, we set out to improve on these initial results by including more photometric data in our analysis, thus creating more leverage to break the degeneracy between effective temperature and extinction. Unlike Andrae et al. (2018), who used a machine-learning algorithm to infer astrophysical parameters, we have chosen a more classical approach: Bayesian parameter inference over a grid of stellar models. In this sense, both our method and our input data are quite different from the work carried out by Andrae et al. (2018).

Figures 20 and 21 show a comparison of the StarHorse results for A G extinction and effective temperature, respectively, in a similar fashion as for the comparison to the Bailer-Jones et al. (2018) distances in Fig. 18. The top panels show the median absolute differences between the two results as a function of distance. For the flag-cleaned sample (red density and red running median line), despite the large spread we observe a remarkable overall concordance between the extinction scales of Apsis and StarHorse up to a distance of ∼1 kpc, followed by growing deviations towards larger distances. For nearby stars, we also note the effect of the A G positivity requirement imposed by the DR2 Apsis pipeline. For the effective temperatures, the agreement is significantly worse (we note the large y-axis scale), most likely due to the Apsis assumption of zero extinction and the fact that no stellar population prior was applied to the stellar model training dataset, resulting in too small T eff values for reddened stars (see Gaia Collaboration 2018b; Andrae et al. 2018 for details).

This explanation is confirmed in the bottom rows of Figs. 20 and 21, which show the median absolute A G and T eff differences as a function of sky position, for the full sample and the SH_OUTFLAG -cleaned sample, respectively. We find that the median A G deviations vary over the sky, indicating larger systematic differences close to the Galactic Centre. We can also make out the footprint of the Pan-STARRS1 survey in the A G comparison maps, indicating a slightly different extinction scale when these data are not available. The T eff differences, on the other hand, follow the dust distribution in the Galaxy, an unphysical feature that is not present in the StarHorse data, and that can also be explained by the zero-extinction assumption imposed for the Gaia DR2 Apsis run.

The top right panels of Figs. 20 and 21 compare the quoted statistical uncertainties of StarHorse and Apsis. However, the meaning of the Apsis uncertainties is probably limited, since the uncertainties are certainly dominated by systematics, as we have seen above. We therefore suggest that StarHorse , in addition to providing more accurate results, also provides more realistic uncertainty estimates.

As an additional check, Fig. 22 shows a comparison of the StarHorse extinction-corrected CMD with the one obtained from combining the Bailer-Jones et al. (2018) distances with the extinction estimates by Andrae et al. (2018) for the flag-cleaned sample. Apart from the increase in number counts (137 million vs. 84 million), we observe that the StarHorse results (left panel) produce a much more populated lower main sequence and a more well-defined red clump when compared to the Andrae et al. (2018) CMD (right panel).
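For reference, the quantities plotted in such an extinction-corrected CMD follow from the standard distance-modulus relation. The sketch below assumes distances in kpc and extinction and reddening in magnitudes; the function names are hypothetical, and catalogue columns such as MG0 and BPRP0 (see Appendix D) already provide corrected values, so this is only meant to make the definitions explicit.

```python
import numpy as np

def absolute_magnitude(g_mag, dist_kpc, a_g):
    """Extinction-corrected absolute magnitude: M = G - (5 log10(d / pc) - 5) - A_G."""
    dist_pc = np.asarray(dist_kpc, dtype=float) * 1000.0
    distance_modulus = 5.0 * np.log10(dist_pc) - 5.0
    return np.asarray(g_mag, dtype=float) - distance_modulus - np.asarray(a_g, dtype=float)

def dereddened_colour(bp_rp, e_bp_rp):
    """Intrinsic colour: observed colour minus colour excess."""
    return np.asarray(bp_rp, dtype=float) - np.asarray(e_bp_rp, dtype=float)

# A star with G = 15.5 at 1 kpc behind 0.5 mag of G-band extinction.
m_g0 = absolute_magnitude(15.5, 1.0, 0.5)
colour0 = dereddened_colour(1.3, 0.25)
```

Combining distances and extinctions from two independent catalogues, as done for the right panel, propagates the errors of both into the absolute magnitude, which is one plausible reason for a less sharply defined CMD.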

Finally, Fig. 23 shows a comparison of two-dimensional A G extinction maps in a narrow distance slice in the Orion region, showing the increase in number of stars with extinction estimates from StarHorse , especially in dense obscured regions, which allows for more substructure to be revealed.

In summary, we can confidently state that our astrophysical parameters are more accurate and have more reliable statistical uncertainties than the initial Apsis parameters obtained as part of Gaia DR2. This was expected from the inclusion of more multi-wavelength observations for a large part of the Gaia DR2 stars (Andrae et al. 2018). With this work we have verified this expectation quantitatively and provided improved results. We expect that with a machine-learning approach similar to that of Andrae et al. (2018) or Das & Sanders (2019), accompanied by a better training sample, our results can be further improved.

In the near future, Gaia eDR3 (envisioned for summer 2020) will provide improved photometry and astrometry for a similar number of sources as contained in DR2. This will enable short-term improvements on the results presented in this paper, with StarHorse or similar codes. Gaia DR3 (scheduled for spring 2021) will then provide much more precise astrophysical parameters determined from the BP/RP and RVS spectra. We imagine, however, that there may still be room for further improvements by adding additional constraints (such as near- and mid-infrared photometry).

7. Conclusions

In this work we have derived Bayesian stellar parameters, distances, and extinctions for 265 million stars brighter than G = 18 with the StarHorse code. By combining the precise parallaxes and optical photometry delivered by Gaia’s second data release (Gaia DR2) with the photometric catalogues of Pan-STARRS1, 2MASS, and AllWISE, and the use of informative Galactic priors, our results substantially improve the accuracy of the extinction and effective temperature estimates provided with Gaia DR2 (Andrae et al. 2018), and arguably also the distances for distant giant stars, when compared to Bailer-Jones et al. (2018). When cleaning our results for both unreliable input and output data, we obtain a sample of 137 million stars for which we achieve a median precision of 5% in distance, 0.20 mag in V-band extinction, and 245 K in effective temperature for G ≤ 14, degrading slightly towards fainter magnitudes (12%, 0.20 mag, and 245 K at G = 16; 16%, 0.23 mag, and 260 K at G = 17, respectively).

To verify our results, we presented distance- and extinction-corrected colour-magnitude diagrams, extinction maps as a function of distance, extensive density maps, as well as comparisons to asteroseismology, star clusters, the high-resolution spectroscopic survey APOGEE, and the Gaia DR2 astrophysical parameters and distances themselves. Furthermore, our results have already been used to infer the Galactic escape speed curve (Monari et al. 2018), to study the kinematic structure of the Galactic disc (Quillen et al. 2018; Carrillo et al. 2019), and for the survey simulations of the 4MOST spectroscopic survey (de Jong et al. 2019; Chiappini et al. 2019).

In this paper we also report for the first time a clear manifestation of the Galactic bar directly in the stellar density distributions. Considering that we assumed a vastly different prior for the density in the Galactic bulge, this observation can almost be considered a direct imaging of the Galactic bar. We also find a kinematic imprint of the coherent motion in the Galactic bar in the proper motion maps presented in Sect. 4.5. A more detailed study of the Galactic bulge will be presented in Queiroz et al. (in prep.). We are confident that our value-added dataset will be useful for various other Galactic science cases, such as mapping the three-dimensional dust distribution within the Milky Way, or hunting for metal-poor stars.

We make our results available through the ADQL query interface of the Gaia mirror at AIP ( gaia.aip.de ), and, together with complementary Gaia DR2 information, as binary tables at data.aip.de . The digital object identifier of this dataset is doi:10.17876/gaia/dr.2/51 .

2 For a short and comprehensive review, see Lindegren (2018), accessible at https://www.cosmos.esa.int/web/gaia/dr2-known-issues

3 For simplicity we call our extinction parameter A V , although it refers to A 5420 Å , as advertised by Schlafly et al. (2016).

4 Provided by the Spanish Virtual Observatory’s Theoretical Spectra web server (http://svo2.cab.inta-csic.es/theory/newov2/index.php)

Acknowledgments

The authors thank Benoît Mosser (Paris), Lola Balaguer (Barcelona), Ralf-Dieter Scholz (Potsdam), and the careful referee for valuable comments on the manuscript. We also thank Eleonora Zari (Leiden) for testing a preliminary version of the database. During the preparation of the data for the StarHorse run, we have frequently used the Gaia archive at ESAC (Salgado et al. 2017) as well as its mirrors at AIP and ARI. FA warmly thanks Alcione Mora and Juan González-Núñez (ESAC) for support with the Gaia archive in a critical moment. During the analysis, we have made extensive use of the astronomical java software TOPCAT and STILTS (Taylor 2005), as well as the python packages numpy and scipy (Oliphant 2007), astropy (Astropy Collaboration 2013), dask (Dask Development Team 2016), HoloViews , and matplotlib (Hunter 2007). Figures 3 and B.1 through B.3 were created using the corner package (Foreman-Mackey 2016). For Fig. 9 we used galpy.util.bovy_coords (Bovy 2015) to transform the proper motions to the Galactic frame. This research has made use of the SVO Filter Profile Service (http://svo2.cab.inta-csic.es/theory/fps/; Rodrigo et al. 2012; Rodrigo & Solano 2013) supported from the Spanish MINECO through grant AYA2017-84089. This work has made use of data from the European Space Agency (ESA) mission Gaia (http://www.cosmos.esa.int/gaia), processed by the Gaia Data Processing and Analysis Consortium (DPAC, http://www.cosmos.esa.int/web/gaia/dpac/consortium). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 800502 H2020-MSCA-IF-EF-2017. 
This work was partially supported by the MINECO (Spanish Ministry of Economy) through grant ESP2016-80079-C2-1-R (MINECO/FEDER, UE) and MDM-2014-0369 of ICCUB (Unidad de Excelencia María de Maeztu).

References

Appendix A: Justification for StarHorse flag definitions

In the case of astro-photometric data with very uncertain parallaxes, Bayesian inference often results in multimodal multidimensional posteriors. However, due to the huge data volume of Gaia DR2, the current StarHorse version only saves the 1D median statistics for each posterior variable. This implies that in the case of a very bi- or even multimodal posterior, the median of one of the output parameters may lie in an unlikely, and sometimes unphysical, part of the model parameter space.

We explained this in Sect. 3.4, and now illustrate the effect in some more detail using Figs. A.1 and A.2. Figure A.1 shows 1% of the full StarHorse output sample in an extinction-corrected Gaia colour-magnitude diagram. Overlaid are some of the PARSEC 1.2S models (for different metallicities) that were used to find the most likely combination of stellar parameters, distance, and extinction for each star. By construction, we expect that most of the stars fall in places compatible with at least one stellar model, and this is indeed the case for the vast majority of stars.

Some stars, however, are situated outside the CMD space defined by the stellar models. As explained briefly in Sect. 3.4, this means that the combination of their median posterior absolute magnitude, distance, and extinction should not be used together. The most prominent unphysical feature in the CMD is certainly the nose feature between the main sequence and the red-giant branch.

We verified that this effect only occurs for faint stars with very uncertain parallaxes, which is when the PDF of inverse parallax becomes seriously unbound (see Fig. 1 and Bailer-Jones 2015; Astraatmadja & Bailer-Jones 2016a; Luri et al. 2018). This results in a poor discrimination between dwarfs and giants for these typically faint (G ≳ 16.5; see Fig. 2) stars. Although their median effective temperatures and extinctions may still be useful, their median 1D distances and other parameters should not be used.
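The way in which a bimodal posterior can place the 1D median in an unlikely region is easy to reproduce with a toy example. The sketch below draws samples from an artificial, equal-weight mixture of a nearby "dwarf" solution and a distant "giant" solution; all numbers are invented for illustration and do not correspond to any star in the catalogue.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Toy distance posterior (kpc): two well-separated solutions of equal weight.
dwarf = rng.normal(1.0, 0.1, n)   # nearby main-sequence solution
giant = rng.normal(4.0, 0.3, n)   # distant giant solution
samples = np.concatenate([dwarf, giant])

median = np.median(samples)
# Fraction of posterior samples lying within 0.2 kpc of the median.
support = np.mean(np.abs(samples - median) < 0.2)
```

Here the median falls in the gap between the two solutions, where almost none of the posterior samples lie, so a distance quoted at the median is compatible with neither the dwarf nor the giant interpretation.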

In any case, the flags provided together with the StarHorse catalogue should only be regarded as a guidance. We encourage users of our data to apply their own quality cuts depending on their particular science case.

The “bloody eye” effect for stars with poor parallaxes

In addition to the nose feature in the CMD, Fig. A.2 shows that the StarHorse distances for the full G < 18 sample (left panels) result in a very different appearance of the sampled space density in Galactic co-ordinates when compared to the flag-cleaned StarHorse distances (right panels). This effect is a direct result of the poor data quality for faint stars and their consequently broad distance PDFs, as discussed above and in Sect. 3.4.4. Removing stars with such uncertain distances (typically σ d /d 50 ≳ 0.6) leaves us with a much more meaningful density map.
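A cut of this kind can be sketched as follows, assuming that the 16th, 50th, and 84th percentiles of the distance posterior are available per star and that σ d is approximated by half the 16th to 84th percentile range. The function names, the percentile-based definition of σ d, and the use of 0.6 as a hard threshold are assumptions made for illustration; the actual flag definitions are given in Sect. 3.4.

```python
import numpy as np

def relative_distance_uncertainty(d16, d50, d84):
    """Approximate sigma_d / d50 from posterior percentiles (hypothetical helper)."""
    d16, d50, d84 = (np.asarray(x, dtype=float) for x in (d16, d50, d84))
    return 0.5 * (d84 - d16) / d50

def keep_well_constrained(d16, d50, d84, max_rel_unc=0.6):
    """Boolean mask selecting stars with relative distance uncertainty below the limit."""
    return relative_distance_uncertainty(d16, d50, d84) < max_rel_unc

# Two toy stars: one with a narrow and one with a very broad distance PDF (kpc).
mask = keep_well_constrained([0.9, 1.0], [1.0, 3.0], [1.1, 6.0])
```

The first toy star has a relative uncertainty of 0.1 and is kept, while the second has about 0.83 and is rejected.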

Appendix B: Parameter correlations and examples of StarHorse joint posterior PDFs

Figure 3 presented the primary output of the StarHorse code for the Gaia DR2 sample in one plot, displaying the distributions and correlations of T eff , d, and A V , and their respective uncertainties, as well as G magnitude and parallax signal-to-noise ratio. In Fig. B.1, we now display also the secondary output parameters log g, [M/H], and M * , and their inter-correlations with the primary parameters. As in Fig. 3, the diagonal panels of Fig. B.1 provide the one-dimensional distributions for each parameter as area-normalised histograms, while the off-diagonal panels show the correlations between each of the output parameters.

In addition to the observations described in Sect. 4.1, we now also observe a gridding effect in the [M/H] dimension, due to the finite metallicity resolution of our PARSEC model grid (σ [M/H] ) chosen to optimise the computation cost. However, we recall that the quality of the output photometric metallicities is very diverse, and in many cases dubious (see Sects. 5.3 and 5.5). We also recall that the main objective of this paper is to deliver more precise distances, extinctions, and effective temperatures for a larger number of stars than provided in Gaia DR2. The secondary output parameters (log g, [M/H], M * ) mainly serve to attach stellar spectrum templates to Gaia DR2 stars for the 4MOST Simulator (de Jong et al. 2019), and to thereby assess the targeting strategy of the 4MOST low-resolution disc and bulge survey (4MIDABLE-LR; Chiappini et al. 2019).

To gain further insight into the correlations between the output parameters, we now discuss some randomly chosen full posterior PDFs. Ideally, in addition to the marginal median statistics for each output parameter, one would also like to report the full posterior. However, the large size of the posterior data files makes it completely unviable to even store this information (let alone publish it), even for the case of only thousands of stars. In this appendix, we therefore only show a few examples of StarHorse joint posterior PDFs for the interested reader.

Figures B.2 and B.3 show corner plots (Foreman-Mackey 2016) of the full posterior PDF projected onto one- and two-dimensional subspaces. For visibility, we only show contours in each of the plots. Figure B.2 shows examples of stars with well-determined posterior parameters. As discussed in Sect. 5.1, the precision in the output parameters mainly depends on the parallax signal-to-noise ratio, but the precision of effective temperature and extinction estimates is considerably increased when the full information is available.

Finally, Fig. B.3 shows a few cases of stars for which StarHorse was only able to determine very uncertain parameters ( SH_OUTFLAG[0]==“1” ), due to the poor precision of the input parallaxes. These stars, some of them responsible for the unphysical nose in the CMD (see Sect. 4) as well as the bloody-eye effect (see Appendix A), display varying degrees of bimodality in their posterior PDFs, in many cases meaning that StarHorse was unable to decide with certainty if these stars are dwarfs or giants. In consequence, as discussed in Sect. 3.4 and Appendix A, their median output parameters are not necessarily compatible with each other.

Appendix C: Data model

Table C.1 provides the data model for the provided StarHorse output tables.

Appendix D: Example queries

In this appendix we show some examples of ADQL queries that can be used to access the StarHorse Gaia DR2 results via the Gaia DR2 mirror archive at gaia.aip.de .

The first example query shows how to extract the median distance and extinction for the first 50 objects of the flag-cleaned StarHorse sample (using both the SH_GAIAFLAG and the SH_OUTFLAG ; see Sect. 3.4):

SELECT TOP 50 s.glon, s.glat, s.dist50, s.AV50
FROM gdr2_contrib.starhorse AS s
WHERE s.SH_OUTFLAG LIKE '00000'
AND s.SH_GAIAFLAG LIKE '000'

The second example shows how to access the first 50 rows of our results, cross-matched with the Gaia DR2 catalogue, cleaned only for the main StarHorse output flag, SH_OUTFLAG [0]==“0”:

SELECT TOP 50 s.*, g.ra, g.dec
FROM gdr2.gaia_source AS g, gdr2_contrib.starhorse AS s
WHERE g.source_id = s.source_id
AND s.SH_OUTFLAG LIKE '0%%%%'
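Once a table has been downloaded, the same flag selections can be applied on the client side. The following minimal sketch assumes that the flags are kept as strings, as suggested by the LIKE comparisons in the queries; the function names are hypothetical.

```python
def main_outflag_is_clean(sh_outflag):
    """True if the first digit of the output flag is '0' (the LIKE '0%' case)."""
    return str(sh_outflag).startswith("0")

def fully_clean(sh_outflag, sh_gaiaflag):
    """True only for the fully flag-cleaned sample of the first example query."""
    return str(sh_outflag) == "00000" and str(sh_gaiaflag) == "000"

# Toy rows of (SH_OUTFLAG, SH_GAIAFLAG).
rows = [("00000", "000"), ("01000", "000"), ("10000", "010")]
main_clean = [main_outflag_is_clean(out) for out, _ in rows]
all_clean = [fully_clean(out, gaia) for out, gaia in rows]
```

Keeping the flags as strings matters in practice: reading them as integers would drop the leading zeros and make the digit positions ambiguous.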

The next example shows how to access the first 50 rows of our catalogue cross-matched to the Gaia DR2 main source table, without cleaning for any StarHorse flags, but selecting only red-clump stars with ruwe < 1.3:

SELECT TOP 50 s.*, g.ra, g.dec
FROM gdr2.gaia_source AS g, gdr2_contrib.starhorse AS s
WHERE g.source_id = s.source_id
AND 4500 < s.teff50 AND s.teff50 < 5000
AND 2.35 < s.logg50 AND s.logg50 < 2.55
AND -0.6 < s.met50 AND s.met50 < 0.4
AND s.ruwe < 1.3

The following example extracts an extinction-corrected, flag-cleaned CMD for stars in a 1° region around the centre of the open cluster NGC 6819 (see Fig. 17, third row):

SELECT g.source_id, s.MG0, s.BPRP0
FROM gdr2.gaia_source AS g, gdr2_contrib.starhorse AS s
WHERE g.source_id = s.source_id
AND s.SH_OUTFLAG LIKE '00000'
AND s.SH_GAIAFLAG LIKE '000'
AND 1=CONTAINS(POINT('GALACTIC',g.l,g.b), CIRCLE('GALACTIC',74.,8.5, 1.))

Finally, the last example shows how to retrieve all Gaia DR2 stars for which StarHorse did not converge:

SELECT g.source_id, g.l, g.b, g.parallax, g.parallax_error, g.phot_g_mean_mag, g.phot_bp_mean_mag, g.phot_rp_mean_mag
FROM gdr2.gaia_source AS g
LEFT OUTER JOIN gdr2_contrib.starhorse AS s ON (g.source_id=s.source_id)
WHERE g.phot_g_mean_mag <= 18.0
AND s.source_id IS NULL
