The following article is Open access

Evolution of the Exoplanet Size Distribution: Forming Large Super-Earths Over Billions of Years

Trevor J. David, Gabriella Contardo, Angeli Sandoval, Ruth Angus, Yuxi (Lucy) Lu, Megan Bedell, Jason L. Curtis, Daniel Foreman-Mackey, Benjamin J. Fulton, Samuel K. Grunblatt, Erik A. Petigura

Published 2021 May 14 • © 2021. The Author(s). Published by the American Astronomical Society.
The Astronomical Journal, Volume 161, Number 6
Citation: Trevor J. David et al 2021 AJ 161 265
DOI 10.3847/1538-3881/abf439

Trevor J. David

AFFILIATIONS

Center for Computational Astrophysics, Flatiron Institute, New York, NY 10010, USA; tdavid@flatironinstitute.org

Department of Astrophysics, American Museum of Natural History, Central Park West at 79th Street, New York, NY 10024, USA

tdavid@flatironinstitute.org

https://orcid.org/0000-0001-6534-6246

Gabriella Contardo

AFFILIATIONS

Center for Computational Astrophysics, Flatiron Institute, New York, NY 10010, USA

https://orcid.org/0000-0002-3011-4784

Angeli Sandoval

AFFILIATIONS

Department of Physics and Astronomy, Hunter College, City University of New York, New York, NY 10065, USA

https://orcid.org/0000-0003-1133-1027

Ruth Angus

AFFILIATIONS

Center for Computational Astrophysics, Flatiron Institute, New York, NY 10010, USA

Department of Astrophysics, American Museum of Natural History, Central Park West at 79th Street, New York, NY 10024, USA

Department of Astronomy, Columbia University, 550 West 120th Street, New York, NY, USA

https://orcid.org/0000-0003-4540-5661

Yuxi (Lucy) Lu

AFFILIATIONS

Department of Astronomy, Columbia University, 550 West 120th Street, New York, NY, USA

Department of Astrophysics, American Museum of Natural History, Central Park West at 79th Street, New York, NY 10024, USA

https://orcid.org/0000-0003-4769-3273

Megan Bedell

AFFILIATIONS

Center for Computational Astrophysics, Flatiron Institute, New York, NY 10010, USA

https://orcid.org/0000-0001-9907-7742

Jason L. Curtis

AFFILIATIONS

Department of Astrophysics, American Museum of Natural History, Central Park West at 79th Street, New York, NY 10024, USA

https://orcid.org/0000-0002-2792-134X

Daniel Foreman-Mackey

AFFILIATIONS

Center for Computational Astrophysics, Flatiron Institute, New York, NY 10010, USA

https://orcid.org/0000-0002-9328-5652

Benjamin J. Fulton

AFFILIATIONS

California Institute of Technology, Pasadena, CA 91125, USA

IPAC-NASA Exoplanet Science Institute, Pasadena, CA 91125, USA

https://orcid.org/0000-0003-3504-5316

Samuel K. Grunblatt

AFFILIATIONS

Center for Computational Astrophysics, Flatiron Institute, New York, NY 10010, USA

Department of Astrophysics, American Museum of Natural History, Central Park West at 79th Street, New York, NY 10024, USA

https://orcid.org/0000-0003-4976-9980

Erik A. Petigura

AFFILIATIONS

Department of Physics and Astronomy, University of California, Los Angeles, CA 90095, USA

https://orcid.org/0000-0003-0967-2893

Dates

Received 2020 November 19
Revised 2021 March 18
Accepted 2021 March 23
Published 2021 May 14

Unified Astronomy Thesaurus concepts

Exoplanets; Exoplanet evolution; Exoplanet astronomy; Super Earths; Mini Neptunes

Abstract

The radius valley, a bifurcation in the size distribution of small, close-in exoplanets, is hypothesized to be a signature of planetary atmospheric loss. Such an evolutionary phenomenon should depend on the age of the star–planet system. In this work, we study the temporal evolution of the radius valley using two independent determinations of host star ages among the California–Kepler Survey (CKS) sample. We find evidence for a wide and nearly empty void of planets in the period–radius diagram at the youngest system ages (≲2–3 Gyr) represented in the CKS sample. We show that the orbital period dependence of the radius valley among the younger CKS planets is consistent with that found among those planets with asteroseismically determined host star radii. Relative to previous studies of preferentially older planets, the radius valley determined among the younger planetary sample is shifted to smaller radii. This result is compatible with an atmospheric loss timescale on the order of gigayears for progenitors of the largest observed super-Earths. In support of this interpretation, we show that the planet sizes that appear to be unrepresented at ages ≲2–3 Gyr are likely to correspond to planets with rocky compositions. Our results suggest that the size distribution of close-in exoplanets and the precise location of the radius valley evolve over gigayears.

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

By far the most intrinsically common planets known are small (<4 R_⊕), close-in (<1 au) planets. NASA’s Kepler mission (Borucki et al. 2010) revealed the surprising abundance of these planets; some 30%–60% of Sun-like stars host a small, close-in planet, depending on assumptions about the intrinsic multiplicity and inclination dispersion within planetary systems (Fressin et al. 2013; Petigura et al. 2013; Zhu et al. 2018; He et al. 2019). An enduring mystery posed by small planets is how some accreted sizable atmospheres while others appear to have avoided runaway accretion altogether (e.g., Ikoma & Hori 2012; Lee et al. 2014; Lee & Chiang 2016). Oftentimes Kepler multiplanet systems host planets both with and without atmospheres, in some cases separated from one another by only a hundredth of an astronomical unit (Carter et al. 2012).

Recent progress in understanding small planets has been fueled by improved precision in stellar and planetary parameters. Through homogeneous spectroscopic characterization of >1300 Kepler planet hosts, the California–Kepler Survey (CKS; Johnson et al. 2017; Petigura et al. 2017) revealed that the size distribution of close-in (P < 100 days) small planets is bimodal, with a valley in the completeness-corrected radius distribution between 1.5 and 2 R_⊕ (Fulton et al. 2017). The radius valley is widely believed to be a signature of atmospheric loss. This belief is bolstered by determinations of planet densities on either side of the valley; planets below the valley, dubbed super-Earths, have densities consistent with a rocky composition, while planets above the valley, known as sub-Neptunes, require atmospheres of a few percent by mass to explain the low measured densities (e.g., Weiss & Marcy 2014; Rogers 2015). In the atmospheric loss model, some fraction of super-Earths are the remnant cores of planets that shed their primordial envelopes, which potentially alleviates the issue of neighboring planets with dissimilar densities (e.g., Lopez & Fortney 2013; Owen & Morton 2016). While exploration of the radius valley among planets orbiting low-mass stars has provided support for an alternative hypothesis (formation in a gas-poor disk without the need for atmospheric loss; Cloutier & Menou 2020), atmospheric erosion remains the leading theory for planets around Sun-like stars.

Atmospheric loss requires energy. Energy deposited into a planet’s atmosphere from an internal or external source can heat gas to velocities exceeding the planet’s escape velocity. External mechanisms of energy deposition include photoevaporation (heating of the planet’s thermosphere by X-ray and extreme ultraviolet radiation; e.g., Owen & Jackson 2012) and impacts by planetesimals or planetary embryos (Liu et al. 2015; Inamdar & Schlichting 2016; Chatterjee & Chen 2018; Wyatt et al. 2020). Internal energy deposition can be provided by the luminosity of a planet’s cooling core (e.g., Ginzburg et al. 2016). Notably, planetary evolution models studying the effect of photoevaporation predicted the existence of a radius valley before it was observed (Lopez & Fortney 2013; Owen & Wu 2013; Jin et al. 2014; Chen & Rogers 2016). However, subsequent studies considering the effects of core-powered mass loss were also able to reproduce the bimodal radius distribution of small planets (Ginzburg et al. 2018; Gupta & Schlichting 2019, 2020). Photoevaporation and core cooling remain the two leading explanations for the radius valley, and both processes may well be important, but determining the relative importance of the two effects will require a better understanding of the dependence of the valley on other key parameters.

Determining how empty the radius gap is represents an important step toward understanding its origins. In the initial CKS sample, typical planet radius uncertainties were comparable to the width of the gap so that an intrinsically empty gap would not have been resolved (Fulton et al. 2017). Van Eylen et al. (2018) studied planets orbiting a subset of Kepler host stars with precise asteroseismic parameters (including ages ranging from ∼2 to 10 Gyr) and found a gap considerably wider and emptier than that found in the initial CKS sample. Including trigonometric parallaxes from Gaia DR2, Fulton & Petigura (2018) were able to improve the median R_* errors by a factor of 5 in the CKS sample, but the gap remained populated. Those authors presented simulations suggesting that the gap is not intrinsically empty (i.e., that it is not filled in solely by noisy data) and that there are real planets in the gap. More recently, however, Petigura (2020) showed that a sizable number of planets in and around the gap have poorly determined radii due to high impact parameters, indicating that the gap may be emptier than previously appreciated.

It also appears that the gap, which is a 1D projection of a higher-dimensional manifold, is partially filled in due to a dependence of the gap center on orbital period (or stellar light intensity) and host star mass. The gap center is anticorrelated with orbital period (Fulton et al. 2017; Van Eylen et al. 2018; MacDonald 2019; Martinez et al. 2019; Loyd et al. 2020), which is considered compatible with both the photoevaporation (e.g., Jin & Mordasini 2018; Lopez & Rice 2018; Owen & Wu 2013, 2017) and core-powered mass-loss models (e.g., Gupta & Schlichting 2019, 2020) but incompatible with formation in a gas-poor disk (Lopez & Rice 2018); at larger orbital periods, only the smallest and least massive cores are susceptible to total atmospheric loss, driving the gap to smaller radii. The length of the radius valley, i.e., its outer boundary in either period or insolation, may also provide clues to its origin, though this parameter remains poorly studied. In the photoevaporation model, the radius valley should not extend beyond orbital periods of 30–60 days, as the incident X-ray and ultraviolet (XUV) flux is believed to be too low to drive substantial mass loss (Owen & Wu 2017). However, the low completeness of the Kepler data set for small planets at these orbital periods presents a challenge for detecting such a transition point.

The gap center is positively correlated with stellar mass (Fulton & Petigura 2018; Wu 2019; Berger et al. 2020a; Cloutier & Menou 2020; Hansen et al. 2021; Van Eylen et al. 2021), although it has been suggested that this trend is due to the relationship between stellar mass and planetary insolation (Loyd et al. 2020). The measured mass dependence of the gap has been used to argue support for photoevaporation (e.g., Wu 2019) but requires that the average planet mass scale approximately linearly with host star mass, an assertion that has not been verified for small planets. By comparison, in the core-powered mass-loss model, the dependence of the radius gap location on stellar mass is a natural consequence of the dependence of planet equilibrium temperature (which partially determines the mass-loss rate in the Bondi-limited regime) on the stellar mass–luminosity relation (e.g., Gupta & Schlichting 2020).

As for metallicity, there is tentative evidence for a wider radius valley for metal-rich stars (Owen & Murray-Clay 2018). Such a dependence could result if the core mass distributions, core bulk densities, or initial atmospheric mass fractions of small planets depend sensitively on the metallicity of the host star, and hence the protoplanetary disk. There is evidence that large Kepler planets (2–8 R_⊕) are more common around higher-metallicity stars (Dong et al. 2018; Petigura et al. 2018) and that planets at short orbital periods are preferentially larger around higher-metallicity stars (Owen & Murray-Clay 2018). Both findings are compatible with a scenario in which metal-rich stars form more massive cores, on average. It has also been suggested that metal-rich stars host planets with higher atmospheric metallicities, which increases the efficiency of atomic line cooling in photoevaporative flows and decreases mass-loss rates (Owen & Murray-Clay 2018). In the core-powered mass-loss model, the rate at which sub-Neptunes cool and contract is anticorrelated with the opacity of the envelope, which is assumed to be proportional to the stellar metallicity (Gupta & Schlichting 2020). Thus, in both the photoevaporation and core-cooling models, larger sub-Neptunes and a consequently wider radius valley are expected around more metal-rich stars (for fixed mass and age and neglecting any potential scaling between metallicity and core mass distributions).

The characteristic timescale for atmospheric loss among close-in exoplanets has been proposed as a key parameter for assessing the relative importance of photoevaporation and core-powered mass loss. Firm observational constraints on that timescale, however, are lacking. Constraining this timescale through exoplanet population studies may provide a means for discerning the relative importance of proposed mass-loss mechanisms. Core-powered mass loss is believed to operate over gigayear timescales (Ginzburg et al. 2016, 2018; Gupta & Schlichting 2019, 2020). By comparison, photoevaporation models predict that the majority of mass loss occurs during the first 0.1 Gyr (e.g., Lopez et al. 2012; Owen & Jackson 2012; Lopez & Fortney 2013; Owen & Wu 2013, 2017), corresponding roughly to the length of time a Sun-like star spends as a saturated X-ray emitter (e.g., Jackson et al. 2012; Tu et al. 2015). However, a more recent study found that the majority of the combined X-ray and extreme UV emission of stars occurs after the saturated phase of high-energy emission, implying that XUV irradiation of exoplanet atmospheres continues to be important over gigayear timescales (King & Wheatley 2021). If valid, then observational constraints on exoplanet evolution timescales may not provide a conclusive means for discerning the relative importance of photoevaporation and core-powered mass loss. Nevertheless, there is evidence that the detected fraction of super-Earths to sub-Neptunes increases over gigayears, suggesting that the sizes of at least some planets evolve on these long timescales (Berger et al. 2020a; Sandoval et al. 2021).

A basic prediction of atmospheric loss models is that the radius gap is wider at younger ages and fills in over time; at a fixed value of high-energy incident flux and initial atmospheric mass fraction, photoevaporation models predict that sub-Neptunes with the least massive cores (and smallest core sizes) will cross the gap first, with more massive cores crossing the gap at later times, if at all. As a result, the radius valley is expected to be wider and emptier at early times, progressively filling in with stripped cores of ever larger masses and sizes (e.g., Rogers & Owen 2021). In the core-powered mass-loss model, Gupta & Schlichting (2020) suggested that the average size of sub-Neptunes is expected to decline with age while the average size of super-Earths remains relatively constant, again leading to a wider and emptier radius valley at earlier times.

While the specific theoretical predictions for the age dependence of the radius valley morphology are uncertain, the fundamental prediction of atmospheric loss models, that this feature should weaken with increasing age, is robust. Here we test this prediction by investigating the time evolution of the exoplanet radius gap using the CKS sample. In Section 2, we describe our sample selection process, including several filters intended to rid our sample of stars or planets with unreliable parameters. Our analysis procedures are discussed in Section 3, and finally, we interpret our results and summarize our primary findings in Section 4.

2. Sample Selection

We began with the CKS VII sample published in Fulton & Petigura (2018, hereafter F18). The CKS VII sample is a well-characterized subset of all Kepler planet candidates. Stellar characterization for these stars was performed in a homogeneous manner, with spectroscopic T_eff, $\log g$, and [Fe/H] derived from high signal-to-noise ratio (S/N), high-dispersion Keck/HIRES spectra (Johnson et al. 2017; Petigura et al. 2017). Using their spectroscopic T_eff and bolometric luminosities computed from Gaia DR2 parallaxes (Gaia Collaboration et al. 2018), extinction-corrected Two Micron All Sky Survey (2MASS) K_s magnitudes (Cutri et al. 2003), and theoretical bolometric corrections from the MESA Isochrones and Stellar Tracks (MIST; Choi et al. 2016; Dotter 2016), F18 derived stellar radii from the Stefan–Boltzmann law. They additionally computed ages for the CKS sample using the isoclassify package (Huber et al. 2017), which also depends on the MIST models. We use the F18 median posterior isochrone ages as one source of age in the analysis that follows.
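The final step of that radius derivation can be illustrated with the Stefan–Boltzmann law written in solar units, R_*/R_⊙ = (L_*/L_⊙)^{1/2} (T_⊙/T_eff)^2. The following is a minimal sketch only, not the F18 pipeline: the function name is hypothetical, the adopted solar effective temperature (5772 K, the IAU nominal value) is an assumption, and the example luminosity and temperature are made up.

```python
import math

# Assumption: IAU nominal solar effective temperature, in kelvin.
T_SUN_K = 5772.0


def stefan_boltzmann_radius(lum_lsun, teff_k):
    """Stellar radius in solar radii from L = 4 pi R^2 sigma T_eff^4.

    Working in solar units cancels the constants 4 pi sigma, leaving
    R/R_sun = sqrt(L/L_sun) * (T_sun/T_eff)^2.
    """
    return math.sqrt(lum_lsun) * (T_SUN_K / teff_k) ** 2


# Sanity check with solar inputs, then a made-up slightly evolved F star.
r_sun = stefan_boltzmann_radius(1.0, 5772.0)
r_hot = stefan_boltzmann_radius(2.0, 6100.0)
print(f"R(Sun) = {r_sun:.3f} R_sun, R(example) = {r_hot:.3f} R_sun")
```

Because the radius scales as the inverse square of T_eff, a fractional temperature error contributes twice as strongly to the radius error as the same fractional error in luminosity contributes through the square root.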

We constructed several filters, many motivated by the cuts outlined in F18, to refine the sample and select those planets and stars with the most reliable parameters. The filters are enumerated as follows.

1. Planet orbital period. We restricted our analysis to planets with orbital periods <100 days. At larger periods, Kepler suffers from low completeness, particularly for small planets.
2. Planet size. We restricted our analysis to planets with sizes <10 R_⊕.
3. Planet radius precision. We restricted our analysis to planets with fractional radius uncertainties $\sigma_{R_P}/R_P < 20\%$.
4. Planet false-positive designation. We excluded planets identified as false positives in Table 4 of the CKS I paper (Petigura et al. 2017), which synthesized dispositions from Mullally et al. (2015), Morton et al. (2016), and the NASA Exoplanet Archive (as accessed on 2017 February 1; Akeson et al. 2013).
5. Stellar radius (dwarf stars). We restricted our analysis to dwarf stars with the following condition:
The left-hand side of this condition excludes a small number of stars far below the main sequence that may have erroneous parameters. The right-hand side excludes stars that have evolved considerably away from the main sequence. This cut is depicted in Figure 1. We additionally excluded cool stars elevated from the main sequence that would result in unrealistically old ages. This cut was performed by requiring T_eff > T_isoc, where T_isoc is the temperature of a log(age) = 10.25, [Fe/H] = +0.25 MIST v1.1 nonrotating isochrone with an equivalent evolutionary point (EEP) <500.
6. Stellar mass. We wish to isolate the effect of stellar age on the exoplanet radius gap while minimizing the effects of stellar mass as much as possible. We restricted our sample to stars with masses 0.75 < M_⋆/M_⊙ < 1.25, where the masses were derived from stellar evolution models in F18.
7. Stellar metallicity. For the same reason we confined our sample in stellar mass, we restricted our analysis to stars with spectroscopically determined metallicities in the range −0.3 < [Fe/H] < +0.3.
8. Isochrone parallax. We removed stars where the Gaia and F18 “spectroscopic” or “isochrone” parallaxes differed by more than 4σ, where the latter quantities were computed in Table 2 of CKS VII. F18 speculated that such discrepancies may be due to flux contamination from unresolved binaries. This cut also removes all stars where the CKS VII isochrone-derived radius, R_iso, differed by more than 10% from the radius derived from the Stefan–Boltzmann law.
9. Stellar dilution (Gaia). We used the r_8 column in Table 2 of CKS VII to exclude stars with closely projected sources detected by Gaia that contribute a nonnegligible fraction of the optical flux in the Kepler aperture. We excluded stars where additional sources in an 8″ radius (two Kepler pixels) contribute more than 10% of the cumulative G-band flux (including the target).
10. Stellar dilution (imaging). As in F18, we excluded Kepler Objects of Interest (KOIs) with closely projected stellar companions bright enough to require corrections to the planetary radii of 5% or more. Like F18, we use the radius correction factor (RCF) computed by Furlan et al. (2017) based on high-resolution imaging from several authors, accepting planets for which RCF < 1.05.
11. Unresolved binaries (Gaia). We excluded stars with Gaia renormalized unit weight error (RUWE) values >1.4, where these values were queried from the Gaia archive.⁸ The RUWE is a goodness-of-fit metric for a single-star astrometric model (Lindegren et al. 2018). It is a sensitive indicator of unresolved binaries (Belokurov et al. 2020), which are a concern for planet radius studies due to potential flux dilution or misidentification of the planet host.
12. Discrepant photometry. We removed stars with discrepant optical brightnesses, ∣G − K_P∣ > 1 mag, indicating a potentially erroneous cross-match between the Kepler and Gaia sources.
13. Reddening. We removed stars with reddening estimates of A_V > 0.5 mag, where these estimates were sourced from Lu et al. (2021). Stars with high reddening are more susceptible to erroneously determined stellar parameters.
14. Planets with grazing transits. Due to degeneracies inherent to light-curve modeling, planets with grazing transits can have poorly constrained radii. Petigura (2020) showed that there is some level of contamination of the radius valley from planets with grazing transits. Since impact parameters measured from long-cadence photometry are unreliable, we follow Petigura (2020) and exclude planets with R_τ < 0.6, where R_τ is the ratio of the measured transit duration to the duration of a b = 0, e = 0 transit with the same period around the same star.
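A subset of the filters above can be sketched as an ordered sequence of threshold tests on a planet table. This is an illustrative sketch under stated assumptions, not the code used for the analysis: every field name (period_d, radius_re, r_tau, and so on) is hypothetical, the three example rows are made up, and the dwarf-star, isochrone-parallax, and Gaia-dilution cuts are omitted because they need quantities not shown here. Only the numerical thresholds are taken from the list.

```python
# Each cut is (name, predicate). A planet is kept only if the predicate is True.
# Field names are hypothetical; thresholds follow the enumerated filters.
CUTS = [
    ("period", lambda p: p["period_d"] < 100.0),
    ("size", lambda p: p["radius_re"] < 10.0),
    ("precision", lambda p: p["radius_err_re"] / p["radius_re"] < 0.20),
    ("false_positive", lambda p: not p["false_positive"]),
    ("stellar_mass", lambda p: 0.75 < p["mstar_msun"] < 1.25),
    ("metallicity", lambda p: -0.3 < p["feh"] < 0.3),
    ("imaging_dilution", lambda p: p["rcf"] < 1.05),
    ("ruwe", lambda p: p["ruwe"] <= 1.4),
    ("photometry", lambda p: abs(p["gmag"] - p["kepmag"]) <= 1.0),
    ("reddening", lambda p: p["a_v"] <= 0.5),
    ("grazing", lambda p: p["r_tau"] >= 0.6),
]


def apply_cuts(planets, cuts=CUTS):
    """Apply cuts in order; return survivors and the running count per cut."""
    kept = list(planets)
    counts = {}
    for name, keep in cuts:
        kept = [p for p in kept if keep(p)]
        counts[name] = len(kept)
    return kept, counts


# Made-up rows: one that passes everything and two single-cut failures.
good = {
    "period_d": 12.3, "radius_re": 1.4, "radius_err_re": 0.1,
    "false_positive": False, "mstar_msun": 1.02, "feh": 0.05,
    "rcf": 1.01, "ruwe": 1.05, "gmag": 13.2, "kepmag": 13.1,
    "a_v": 0.2, "r_tau": 0.9,
}
long_period = dict(good, period_d=250.0)  # fails the period cut
grazing = dict(good, r_tau=0.4)           # fails the grazing-transit cut

kept, counts = apply_cuts([good, long_period, grazing])
print(len(kept), counts)
```

Keeping the cuts as an ordered list makes it simple to build a less restrictive sample by passing only a leading slice of the list, and the running counts show how many planets each filter removes.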

Figure 1. Top: the H-R diagram for the CKS VII sample (Fulton & Petigura 2018). Point colors indicate median posterior ages from that work. Dashed lines indicate the dwarf star selection criteria explained in Section 2. Points circled in black were excluded from our analysis, and points indicated by crosses lack ages. Gray curves indicate solar-metallicity, nonrotating MIST v1.1 isochrones (Choi et al. 2016; Dotter 2016). Bottom: distributions of host star masses (left), ages (middle), and metallicities (right) after performing the dwarf star cut. Note that stars hosting multiple planets are represented more than once in these distributions. All parameters originate from F18.

The overall CKS VII sample contains 1913 planets orbiting 1189 unique stellar hosts. In the analysis that follows, we will refer to the base sample (constructed from the first five cuts enumerated) and the filtered sample (constructed from all of the filters). After applying the filters, the base sample consists of 1443 planets orbiting 871 unique stellar hosts. The filtered sample consists of 732 planets orbiting 466 unique hosts. Later, we find that our analysis is insensitive to many of these restrictions and relax most of them.

2.1. Rotation Period Vetting

We supplemented the CKS sample with stellar rotation periods compiled from the literature, which we use in Section 3.2 to empirically age rank planet hosts. In order to perform an accurate rotation-based selection, it is imperative to have reliable rotation periods. To this end, we performed visual vetting of the full Kepler light curves for each star in the CKS sample. For each star, our period vetting procedure consisted of the following steps.

1. Retrieve the full Kepler long-cadence PDCSAP light curve (Smith et al. 2012; Stumpe et al. 2012) from MAST,⁹ mask known transits using ephemerides from the KOI cumulative table,¹⁰ mask data with nonzero PDCSAP_QUALITY flags, and median normalize each quarter of data.
2. Compile published rotation period measurements from four sources in the literature (McQuillan et al. 2013; Walkowicz & Basri 2013; Mazeh et al. 2015; Angus et al. 2018).
3. Perform a Lomb–Scargle (L-S) periodogram analysis of the Kepler PDCSAP light curve using the LombScargle class in the astropy.timeseries package.
4. Phase-fold the PDCSAP light curve on the L-S peak power period and any published period, as well as on the first harmonic and subharmonic of each of the previously mentioned periods.
5. Generate a vetting sheet including all phase-folded light curves, the L-S periodogram, a 120 day segment of the light curve, and the full light curve.
6. Visually examine each vetting sheet, recording the preferred period source and assigning a reliability flag to each period determination (3: highly reliable; 2: reliable; 1: period could not be unambiguously determined; and 0: no periodicity evident).¹¹
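The periodogram and phase-folding steps above can be sketched as follows. The procedure described in the text uses the LombScargle class in astropy.timeseries; to keep this example dependency-free, the sketch swaps in a hand-written classical Lomb–Scargle periodogram, which is not the same implementation. It is applied to a synthetic, noise-free light curve with a made-up 12.5 day period. The function names, the sampling pattern, the trial period grid, and the reading of "first harmonic and subharmonic" as P/2 and 2P are all assumptions.

```python
import math


def lomb_scargle_power(t, y, period):
    """Classical (unnormalized) Lomb-Scargle power at one trial period."""
    w = 2.0 * math.pi / period
    mean = sum(y) / len(y)
    dy = [v - mean for v in y]
    # Time offset tau that makes the sine and cosine terms orthogonal.
    s2 = sum(math.sin(2.0 * w * ti) for ti in t)
    c2 = sum(math.cos(2.0 * w * ti) for ti in t)
    tau = math.atan2(s2, c2) / (2.0 * w)
    cos_t = [math.cos(w * (ti - tau)) for ti in t]
    sin_t = [math.sin(w * (ti - tau)) for ti in t]
    yc = sum(d * c for d, c in zip(dy, cos_t))
    ys = sum(d * s for d, s in zip(dy, sin_t))
    cc = sum(c * c for c in cos_t)
    ss = sum(s * s for s in sin_t)
    return 0.5 * (yc * yc / cc + ys * ys / ss)


def phase_fold(t, period):
    """Return the phase in [0, 1) of each time stamp for a given period."""
    return [(ti % period) / period for ti in t]


# Synthetic, slightly irregular sampling over ~200 days (made-up values).
true_period = 12.5
t = [0.5 * i + 0.1 * math.sin(i) for i in range(400)]
y = [1.0 + 0.01 * math.sin(2.0 * math.pi * ti / true_period) for ti in t]

# Trial periods from 1.0 to 50.9 days in 0.1 day steps.
trial_periods = [1.0 + 0.1 * k for k in range(500)]
powers = [lomb_scargle_power(t, y, p) for p in trial_periods]
best_period = trial_periods[powers.index(max(powers))]

# Fold on the peak period and on half and twice that period.
fold_periods = [best_period / 2.0, best_period, 2.0 * best_period]
folded = {p: phase_fold(t, p) for p in fold_periods}
print(f"peak period: {best_period:.1f} d")
```

Folding on P/2 and 2P alongside the peak period is what lets a visual check catch cases where the periodogram has locked onto a harmonic of the true rotation period.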

The results of our rotation period vetting are summarized in Table 1. Of the 1189 unique planet hosts in the CKS VII sample, which are predominantly FGK main-sequence stars, we found that approximately 22% have highly reliable rotation periods, 23% have reliable periods, 34% could not have periods determined unambiguously from the light curve, and 21% had no clear periodicity evident in the light curve.

Table 1. Rotation Periods of KOIs in CKS VII Sample

KOI	KIC	P_rot	P_rot Ref.	Flag	A18 P_rot	M13 P_rot	M15 P_rot	W13 P_rot	D21 P_rot
1	11446443	⋯	⋯	1	24.85	⋯	70.55	⋯	43.37
2	10666592	⋯	⋯	1	19.60	⋯	70.69	⋯	46.65
6	3248033	⋯	⋯	0	22.77	⋯	⋯	⋯	⋯

Note. Flag meanings are as follows: 3, highly reliable; 2, reliable; 1, true period could not be unambiguously determined; and 0, no periodicity evident. References: A18, Angus et al. 2018; M13, McQuillan et al. 2013; M15, Mazeh et al. 2015; W13, Walkowicz & Basri 2013; D21, this work.

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

3. Analysis

3.1. Evolution of the P-R Diagram: Isochrone Ages

We first plotted the period–radius (P-R) diagram for CKS planets in four bins of log(age): <9.25, 9.25–9.5, 9.5–9.75, and ≥9.75 for both the filtered and base samples (Figures 2 and 3, respectively).¹² This binning scheme was in part chosen because there are few planets with log(age) < 9 or >10. From these figures, we observed a conspicuous void of planets around the radius valley in the youngest age bin (≲1.8 Gyr). Moreover, the slope of the void appears to be very close to the slope of the radius valley determined in Van Eylen et al. (2018, hereafter V18) from a subset of planets orbiting stars with precise asteroseismic parameters. However, while the slope of the young planet void appears to be consistent with that of the radius valley, the intercept appears to be different. This is evident from the fact that the lower boundary of young sub-Neptunes straddles the V18 line, while the super-Earths are well separated from the V18 valley. In other words, there appears to be a dearth of large super-Earths at younger ages, resulting in an apparent shift in the peak of the super-Earth radius distribution to larger radii at older ages. Our interpretation of this shift is discussed in Section 4.
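The binning just described can be sketched in a few lines. The bin edges are those quoted in the text; the function names are hypothetical, and treating each bin as inclusive of its lower edge is an assumption, since the text does not say how values exactly on an edge are assigned. The conversion helper shows why the youngest bin, log(age) < 9.25, corresponds to ages below roughly 1.8 Gyr.

```python
import bisect

# Bin edges in log10(age / yr), as quoted in the text.
BIN_EDGES = [9.25, 9.5, 9.75]
BIN_LABELS = ["<9.25", "9.25-9.5", "9.5-9.75", ">=9.75"]


def age_bin(log_age):
    """Label of the log(age) bin; lower edges are inclusive (assumption)."""
    return BIN_LABELS[bisect.bisect_right(BIN_EDGES, log_age)]


def log_age_to_gyr(log_age):
    """Convert log10(age / yr) to an age in Gyr."""
    return 10.0 ** log_age / 1e9


for value in (9.0, 9.25, 9.6, 9.9):
    print(value, age_bin(value), round(log_age_to_gyr(value), 2))
```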

Figure 2. First row: evolution of the Kepler planet population in the P-R diagram for the filtered CKS sample (age bins indicated above each panel). Contours show Gaussian kernel density estimates of planets in the overall CKS sample. The black dashed line indicates the radius valley derived by V18. The gray shaded region indicates the 25% pipeline completeness contour calculated from the CKS sample. Second row: 1D distributions of planet radii for the samples plotted above in each case. The nominal location of the radius gap from Fulton et al. (2017) is indicated by the vertical gray stripe. Third row: our base CKS planet host sample in the T_eff–R_* plane (gray) with the host stars in the age bins indicated at the top (pink). Stars hosting planets in the radius range 1.6–1.9 R_⊕ are outlined in black. Fourth row: as in the third row, the distribution of planet hosts in the mass–metallicity plane.

**Figure 3.** Same as Figure 2 but for the base sample.

While age, mass, and metallicity are correlated in the CKS sample, we show that the distribution of masses in each age bin is not changing drastically. The age–metallicity gradient is stronger, and essentially all stars in the youngest age bin are metal-rich. However, it is also clear that stars hosting planets in the radius valley are not exclusively metal-poor but rather have a wide range of metallicities. Additionally, while the [Fe/H] distributions in the two youngest age bins are broadly similar, the distributions of planets in the P-R diagram are markedly different. While these observations offer some degree of assurance that observed features in the P-R diagram are due to age rather than mass or metallicity, we explore the effects of stellar mass and metallicity further in Sections 3.5 and 3.7.

3.2. Evolution of the P-R Diagram: Gyrochronology

To this point, we have only considered isochrone ages from F18 in our analysis. While we present a qualitative validation of the F18 ages in Appendix B, there are substantial uncertainties associated with isochrone ages. It is also possible to empirically age rank the CKS sample with a gyrochronology analysis. Recently, Curtis et al. (2020) presented empirical gyrochrones for several open clusters that enable high-fidelity, model-independent age ranking of solar-type stars with well-determined T_eff and rotation periods.

We separated the CKS sample into “fast” and “slow” rotators using the hybrid NGC 6819 + Ruprecht 147 gyrochrone of Curtis et al. (2020), corresponding to an age of ∼2.7 Gyr. To perform this cut, we first converted the gyrochrone from a (B_P–R_P)–P_rot relation into a T_eff–P_rot relation using the color–temperature polynomial relation presented by those authors (valid for the temperatures considered here).¹³

We constructed the young planet sample from the CKS base sample (Section 2) by choosing host stars that lie between the 0.12 Gyr (Pleiades) and 2.7 Gyr gyrochrones in the T_eff–P_rot diagram and that have high-reliability P_rot flags, RUWE < 1.4 (to remove unresolved binaries with unreliable rotation periods), and T_eff < 6000 K. The T_eff < 6000 K cut is motivated by the fact that gyrochrones cluster closely for hotter stars, so that small temperature uncertainties can translate to large uncertainties in age from a gyrochronology analysis.

The old rotation-selected planet sample was selected in the same fashion from stars lying above the 2.7 Gyr gyrochrone, except for the high-reliability P_rot flag requirement. We found that we assigned the high-reliability flag more often to faster rotators that tend to exhibit higher amplitude and more stable brightness modulations (presumably due to larger, longer-lived spots), while the older, more slowly rotating stars exhibit smaller-amplitude, more sporadic Sun-like variations potentially due to smaller, short-lived spots. Thus, in the slow rotator sample, we accepted stars with either reliable or highly reliable P_rot flags.

The distributions of the young and old rotation-selected planet hosts in the T_eff–P_rot and Hertzsprung–Russell (H-R) diagrams are shown in Figure 4, along with the 1D distributions of their median isochrone ages. We observe that the fast-rotating sample does indeed correspond to stars that lie closer to the zero-age main sequence with median isochrone ages strongly skewed toward younger ages (mostly below 3 Gyr) relative to the CKS sample. The more slowly rotating stars show a distribution of median isochrone ages that is practically indistinguishable from the bulk of the CKS sample but with a significant number of stars in the ∼1–3 Gyr range. However, this is likely to be the result of isochrone clustering on the main sequence and not because those stars are actually young. We also note that only 46% (546/1189) of the stars in our sample have assigned rotation periods from our period vetting procedure. For the remaining stars, it was not possible to unambiguously assign a period. Those stars are preferentially more evolved relative to the periodic sample, though they are observed across the H-R diagram. Thus, the modest decrement at old ages in the age distribution of the slow rotators relative to the overall CKS sample may be the result of a finite active lifetime for solar-type stars.

**Figure 4.** The T_eff–P_rot diagram (left column), H-R diagram (middle column), and median isochrone age distributions (right column) for the young (top row) and old (bottom row) rotation-selected samples. The shaded contours in the left and middle columns show Gaussian kernel density estimates of the full CKS sample in those respective planes. In the left column, the solid, dashed, dashed–dotted, and dotted lines indicate polynomial fits to the empirical gyrochrones of the Pleiades (≈0.12 Gyr), Praesepe (≈0.63 Gyr), NGC 6811 (≈1 Gyr), and NGC 6819 + Ruprecht 147 (≈2.7 Gyr) clusters, respectively (Curtis et al. 2020). In the middle panels, the gray curves show solar-metallicity, nonrotating MIST v1.1 isochrones from log(age) = 9–10 in steps of 0.25 dex.

After separating the planet hosts into the fast and slow rotator samples, we examined the distributions of the corresponding planet populations in the P-R diagram (Figure 5). We observe behavior qualitatively similar to what was found when using isochrone ages to perform age cuts. That is, there is a dearth of exoplanets in the radius valley among planets empirically determined to be younger than ∼2.7 Gyr from a gyrochronology analysis. Among the planets older than ∼2.7 Gyr, the radius valley appears more filled in. Moreover, the slope and boundaries of the radius valley in the left panel of Figure 5 appear to be very close to those derived from the isochrone age-selected sample (as described in Section 3.3).

**Figure 5.** The P-R diagram for exoplanets orbiting stars rotating more rapidly (left) or more slowly (right) than an empirical 2.7 Gyr gyrochrone. The shaded contours represent a 2D Gaussian kernel density estimation for the overall CKS sample. The black dashed lines indicate the margins of the young planet void derived from the isochrone-selected sample in Section 3.3.

3.3. Measuring the Slope of the Void

As the slope of the radius valley contains information about the mechanism(s) responsible for producing it, we proceeded to characterize the void for four planetary samples described as follows. Each sample is a subset of the base sample, sharing the following cuts: planets orbiting dwarf stars (described in Section 2) with P < 100 days, R_P < 10 R_⊕, and $\sigma_{R_P}/R_P < 20\%$. The `isoc_fgk_1to2` sample also employs the age restriction $9 < \log(\mathrm{age}) \leqslant 9.25$, while the `isoc_fgk_lt2` sample is produced from the more inclusive criterion $\log(\mathrm{age}) \leqslant 9.25$. As isochrone ages for hotter stars are more reliable than those of cooler stars, the `isoc_fg_lt2` sample combines the criteria $\log(\mathrm{age}) \leqslant 9.25$ and T_eff > 5500 K. Finally, the `gyro_gk_lt3` sample combines the common cuts with the following criteria: T_eff < 6000 K, RUWE < 1.4, high-reliability rotation periods (reliability flag of 3), and positions in the T_eff–P_rot plane between the empirical Pleiades and NGC 6819 + Ruprecht 147 gyrochrones of Curtis et al. (2020). The distributions of these planetary samples in the P-R and insolation–radius planes are depicted in Figure 6.

**Figure 6.** Age-selected samples of planets in the P-R (left column) and insolation–radius (right column) planes. Age selections are described in Section 2. Previously determined equations for the radius valley are shown as black lines. Point colors indicate the classification used in the SVM analysis (described in Section 3.3).

Following the approach of V18, we used support vector machines (SVMs) to find the decision boundary that maximizes the margins between two distinct classes of planets in the P-R and insolation–radius planes. To label the planets, we found that shifting the V18 radius valley equation downward by 0.07 dex in $\log_{10}(R_P/R_\oplus)$ provided an unambiguous separation of planets into two classes for the `isoc_fgk_1to2` sample. Thus, we used the equation $\log_{10}(R_P/R_\oplus) = -0.09\,\log_{10}(P/\mathrm{d}) + 0.3$ to label planets as sub-Neptunes or super-Earths. To implement the SVM classification, we used the `sklearn.svm.SVC` module in Python with a linear kernel (Pedregosa et al. 2011).
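
The labeling and classification steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the paper: the function names, the toy data generator, and the conversion of the SVC weights into a slope, intercept, and margin offset are all introduced here for illustration. Only the labeling line and the use of `sklearn.svm.SVC` with a linear kernel come from the text.

```python
import numpy as np
from sklearn.svm import SVC


def label_planets(log_p, log_r, slope=-0.09, intercept=0.30):
    """Return 1 above the labeling line (sub-Neptunes) and 0 below it (super-Earths)."""
    return (np.asarray(log_r) > slope * np.asarray(log_p) + intercept).astype(int)


def fit_valley(log_p, log_r, labels, C=10.0):
    """Fit a linear-kernel SVC and express its decision boundary as a line.

    Returns (slope, intercept, margin): the boundary is
    log_r = slope * log_p + intercept, and the two margins sit at
    +/- margin from it, measured along the log_r axis.
    """
    X = np.column_stack([log_p, log_r])
    clf = SVC(kernel="linear", C=C).fit(X, labels)
    w_p, w_r = clf.coef_[0]
    b = clf.intercept_[0]
    # Boundary: w_p * log_p + w_r * log_r + b = 0; margins: same expression = +/- 1.
    return -w_p / w_r, -b / w_r, 1.0 / abs(w_r)


def synthetic_sample(n=300, gap=0.08, seed=0):
    """Hypothetical toy data: two bands of planets separated by an empty strip."""
    rng = np.random.default_rng(seed)
    log_p = rng.uniform(0.0, 2.0, size=2 * n)
    offset = rng.uniform(gap, 0.3, size=2 * n) * np.repeat([1.0, -1.0], n)
    log_r = -0.09 * log_p + 0.30 + offset
    return log_p, log_r
```

With real data, `log_p` and `log_r` would be the base-10 logarithms of the orbital periods (in days) and planet radii (in Earth radii) of one of the age-selected samples.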

We explored the sensitivity of our results to the regularization parameter, $\mathcal{C}$, finding that for $\mathcal{C} < 5$, the SVM misclassifies a large fraction of planets and fails to trace the center of the void that is so readily visible by eye (Figure 7). In determining the equation of the void, we ultimately adopt the slope and intercept derived from the $\mathcal{C} = 10$ case but recommend $\mathcal{C} = 1000$ for determining the upper and lower boundaries of the void. To calculate the uncertainties on the slope and intercept of the radius valley, we performed 10³ bootstrapping simulations, selecting 50 planets (with replacement) randomly from the young planet samples and recording the slope and intercept resulting from the SVM classification for each bootstrapped sample.
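
A bootstrap of this kind might look like the sketch below. It is illustrative only: the function names are hypothetical, the summary by 16th, 50th, and 84th percentiles is an assumption motivated by the percentile ranges quoted in the figure captions, and redrawing a resample that happens to contain a single class is a choice made here so that the classifier can always be fit.

```python
import numpy as np
from sklearn.svm import SVC


def svm_line(log_p, log_r, labels, C):
    """Slope and intercept of the linear SVC decision boundary in the P-R plane."""
    clf = SVC(kernel="linear", C=C).fit(np.column_stack([log_p, log_r]), labels)
    w_p, w_r = clf.coef_[0]
    return -w_p / w_r, -clf.intercept_[0] / w_r


def bootstrap_valley(log_p, log_r, labels, C=10.0, n_boot=1000, n_draw=50, seed=0):
    """Refit the SVM on n_boot resamples of n_draw planets drawn with replacement.

    Returns a (3, 2) array: rows are the 16th, 50th, and 84th percentiles,
    columns are slope and intercept.
    """
    log_p, log_r, labels = map(np.asarray, (log_p, log_r, labels))
    rng = np.random.default_rng(seed)
    fits = []
    while len(fits) < n_boot:
        idx = rng.integers(0, log_p.size, size=n_draw)
        if np.unique(labels[idx]).size < 2:
            continue  # the classifier needs both classes; redraw (assumption)
        fits.append(svm_line(log_p[idx], log_r[idx], labels[idx], C))
    return np.percentile(np.array(fits), [16, 50, 84], axis=0)
```

Because each resample contains only 50 planets, the scatter among the refitted lines reflects how strongly the boundary depends on the few planets nearest the void.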

**Figure 7.** Effect of the regularization parameter, $\mathcal{C}$, on the support vector classification for young CKS planets (the `isoc_fgk_1to2` sample) in the P-R diagram. In each panel, the point colors indicate the planet classification provided in the SVM classification analysis. Points circled in black indicate the support vectors. The solid lines indicate the decision surface of maximal separation, while the dashed lines indicate the margins (as discussed in the text).

Table 2 lists the slopes and intercepts for the young planet void inferred from the SVM bootstrapping simulations. Figures 8 and 9 show the derived radius valley from the bootstrapping simulations, and Figure 10 shows the distributions of slopes and intercepts from this analysis. For the $\mathcal{C}$ values explored here, the inferred slopes and intercepts of the radius valley are relatively constant. We find in almost all cases that the slope of the valley is consistent with the slope found in V18 at the ≲1σ level. However, we find an intercept that is systematically smaller than that found by V18 and Martinez et al. (2019) by at least 2σ and, in some cases, as much as 10σ using the quoted uncertainties from those works. While the statistical significance of this difference is highly dependent on the adopted uncertainty (where ours appears to be generally larger), it is clear from Figure 6 that the void we observe is offset from previous determinations of the radius valley. We note that previous works characterized the radius valley among samples with a broader range of ages, while the focus of this analysis is on the younger planets in the CKS sample. In Section 4, we discuss our interpretation of this difference. The level of agreement between the radius valley slopes derived here and in V18 is noteworthy, given that the samples we characterize are ≈30%–100% larger and selected on the basis of age rather than radius precision, which was the impetus for the V18 sample.

**Figure 8.** The P-R diagram for planets in the `isoc_fgk_1to2` (first row), `isoc_fgk_lt2` (second row), `isoc_fg_lt2` (third row), and `gyro_gk_lt3` (fourth row) samples. Point colors indicate the classifications used in the SVM analysis. The gray line and shaded region show the median and 16th–84th percentile width of the radius valley from the SVM bootstrapping simulations. The dashed lines indicate the median margins from the SVM analysis. The regularization parameter, $\mathcal{C}$, is indicated at the top of each panel.

**Figure 9.** Same as Figure 8 but for the insolation–radius plane.

**Figure 10.** Gaussian kernel density estimation of the distribution of radius valley slopes (α) and intercepts (β) from the SVM bootstrapping simulations with different regularization ($\mathcal{C}$) parameters. The circles with error bars indicate the values derived in Van Eylen et al. (2018) from planets orbiting asteroseismic stars. The squares with error bars indicate the values derived by Martinez et al. (2019) from an independent spectroscopic analysis of the CKS sample.

Table 2. Results of SVM Bootstrapping Simulations

Sample	$\mathcal{C}$	α	β	γ	δ	ε	ζ
`isoc_fgk_1to2`	5	$-{0.09}_{-0.06}^{+0.06}$	${0.33}_{-0.07}^{+0.06}$	${0.14}_{-0.01}^{+0.01}$	${0.06}_{-0.05}^{+0.04}$	${0.13}_{-0.06}^{+0.12}$	${0.14}_{-0.01}^{+0.01}$
`isoc_fgk_1to2`	10	$-{\mathbf{0.08}}_{-0.04}^{+0.06}$	${\mathbf{0.31}}_{-0.05}^{+0.05}$	${0.11}_{-0.01}^{+0.01}$	${\mathbf{0.06}}_{-0.04}^{+0.03}$	${\mathbf{0.13}}_{-0.04}^{+0.11}$	${0.11}_{-0.01}^{+0.01}$
`isoc_fgk_1to2`	100	$-{0.06}_{-0.04}^{+0.02}$	${0.29}_{-0.04}^{+0.02}$	${0.07}_{-0.01}^{+0.01}$	${0.04}_{-0.03}^{+0.02}$	${0.15}_{-0.06}^{+0.04}$	${0.07}_{-0.01}^{+0.01}$
`isoc_fgk_1to2`	1000	$-{0.06}_{-0.04}^{+0.02}$	${0.29}_{-0.03}^{+0.03}$	${0.06}_{-0.01}^{+0.01}$	${0.04}_{-0.01}^{+0.04}$	${0.15}_{-0.08}^{+0.04}$	${0.05}_{-0.01}^{+0.01}$
`isoc_fgk_lt2`	5	$-{0.1}_{-0.05}^{+0.07}$	${0.32}_{-0.06}^{+0.06}$	${0.14}_{-0.01}^{+0.01}$	${0.08}_{-0.05}^{+0.04}$	${0.07}_{-0.09}^{+0.1}$	${0.14}_{-0.01}^{+0.01}$
`isoc_fgk_lt2`	10	$-{0.08}_{-0.05}^{+0.05}$	${0.3}_{-0.07}^{+0.04}$	${0.11}_{-0.01}^{+0.01}$	${0.06}_{-0.04}^{+0.04}$	${0.11}_{-0.07}^{+0.08}$	${0.11}_{-0.01}^{+0.01}$
`isoc_fgk_lt2`	100	$-{0.07}_{-0.04}^{+0.04}$	${0.29}_{-0.05}^{+0.04}$	${0.06}_{-0.01}^{+0.01}$	${0.04}_{-0.03}^{+0.04}$	${0.15}_{-0.06}^{+0.06}$	${0.06}_{-0.01}^{+0.01}$
`isoc_fgk_lt2`	1000	$-{0.08}_{-0.03}^{+0.07}$	${0.3}_{-0.05}^{+0.04}$	${0.05}_{-0.02}^{+0.01}$	${0.05}_{-0.04}^{+0.02}$	${0.13}_{-0.07}^{+0.08}$	${0.05}_{-0.02}^{+0.01}$
`isoc_fg_lt2`	5	$-{0.12}_{-0.06}^{+0.06}$	${0.32}_{-0.05}^{+0.05}$	${0.14}_{-0.02}^{+0.01}$	${0.09}_{-0.05}^{+0.05}$	${0.02}_{-0.11}^{+0.15}$	${0.15}_{-0.02}^{+0.01}$
`isoc_fg_lt2`	10	$-{0.08}_{-0.04}^{+0.05}$	${0.3}_{-0.04}^{+0.05}$	${0.11}_{-0.01}^{+0.01}$	${0.07}_{-0.04}^{+0.03}$	${0.08}_{-0.05}^{+0.11}$	${0.11}_{-0.01}^{+0.01}$
`isoc_fg_lt2`	100	$-{0.05}_{-0.05}^{+0.03}$	${0.27}_{-0.04}^{+0.04}$	${0.06}_{-0.01}^{+0.01}$	${0.06}_{-0.04}^{+0.02}$	${0.11}_{-0.05}^{+0.08}$	${0.07}_{-0.01}^{+0.01}$
`isoc_fg_lt2`	1000	$-{0.07}_{-0.04}^{+0.04}$	${0.28}_{-0.04}^{+0.04}$	${0.05}_{-0.01}^{+0.02}$	${0.07}_{-0.03}^{+0.04}$	${0.07}_{-0.09}^{+0.08}$	${0.06}_{-0.01}^{+0.01}$
`gyro_gk_lt3`	5	$-{0.09}_{-0.04}^{+0.08}$	${0.3}_{-0.08}^{+0.04}$	${0.14}_{-0.01}^{+0.01}$	${0.06}_{-0.04}^{+0.04}$	${0.11}_{-0.07}^{+0.09}$	${0.14}_{-0.01}^{+0.01}$
`gyro_gk_lt3`	10	$-{0.06}_{-0.04}^{+0.06}$	${0.28}_{-0.05}^{+0.05}$	${0.11}_{-0.01}^{+0.01}$	${0.05}_{-0.03}^{+0.03}$	${0.14}_{-0.05}^{+0.06}$	${0.11}_{-0.01}^{+0.01}$
`gyro_gk_lt3`	100	$-{0.05}_{-0.03}^{+0.04}$	${0.27}_{-0.05}^{+0.03}$	${0.06}_{-0.01}^{+0.01}$	${0.03}_{-0.04}^{+0.02}$	${0.17}_{-0.05}^{+0.06}$	${0.06}_{-0.01}^{+0.01}$
`gyro_gk_lt3`	1000	$-{0.03}_{-0.06}^{+0.02}$	${0.26}_{-0.04}^{+0.04}$	${0.05}_{-0.01}^{+0.02}$	${0.02}_{-0.02}^{+0.05}$	${0.17}_{-0.07}^{+0.07}$	${0.05}_{-0.02}^{+0.01}$

Note. The equation for the radius valley in the P-R diagram is of the form $\log_{10}(R_P/R_\oplus) = \alpha\,\log_{10}(P/\mathrm{d}) + \beta$. In the insolation–radius diagram, it is $\log_{10}(R_P/R_\oplus) = \delta\,\log_{10}(S_\mathrm{inc}/S_\oplus) + \epsilon$. Adopted values are in bold.

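The upper and lower boundaries of the radius valley are given by the equation

$$\log_{10}(R_P/R_\oplus) = \alpha\,\log_{10}(P/\mathrm{d}) + \beta \pm \gamma$$

in the P-R plane or

$$\log_{10}(R_P/R_\oplus) = \delta\,\log_{10}(S_\mathrm{inc}/S_\oplus) + \epsilon \pm \zeta$$

in the insolation–radius plane. We use the highest $\mathcal{C}$ parameter explored here for determination of the radius valley boundaries, as it provides the closest match to the data (i.e., the fewest planets inside those boundaries).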

3.4. Calculation of False-alarm Probability

To determine whether the void we observe in the P-R diagram could be due to chance, we performed simulations to determine the probability of finding n_itv or fewer planets in the void from N planets selected at random (without replacement) from the CKS sample. Here N is the total number of planets in each of the samples described in Section 2, i.e., 156, 238, 124, and 190 for the `isoc_fgk_1to2`, `isoc_fgk_lt2`, `isoc_fg_lt2`, and `gyro_gk_lt3` samples, respectively. The definition of the radius valley boundaries and hence the true number of planets in the valley, n_itv,true, are sensitive to the regularization parameter, $\mathcal{C}$, and the specific sample used in the SVM bootstrapping analysis (see Figure 11). For each sample and $\mathcal{C}$ value, we performed 10⁴ simulations, selecting N planets randomly (without replacement) from the overall CKS sample, modeling the planet period and radius uncertainties with normal distributions, and recording the number of planets in the valley, $n_{\mathrm{itv},\mathrm{sim}}$. The false-alarm probability was then computed as the fraction of total trials that satisfied the condition $n_{\mathrm{itv},\mathrm{sim}} \leqslant n_{\mathrm{itv},\mathrm{true}}$. The results of these simulations are tabulated in Table 3. For $\mathcal{C} = 5$, we find false-alarm probabilities in the range of 18%–30%, but from Figure 11, it is clear that the SVM margins in this case are so wide as to not provide an accurate description of the void boundaries. The same may be argued for the $\mathcal{C} = 10$ case, but even then we find false-alarm probabilities of <10%. Finally, for the $\mathcal{C} = 100, 1000$ cases, for which the SVM margins precisely trace the boundaries of the void, we found false-alarm probabilities of ≪1%.
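
The simulation described above can be sketched roughly as follows. The function and argument names are hypothetical, and the sketch assumes that a planet counts as "in the valley" when its log radius lies within ±γ of the valley line in the P-R plane, with α, β, and γ taken from Table 2. Discarding nonphysical (nonpositive) draws is also a choice made here, not something stated in the text.

```python
import numpy as np


def in_valley(period, radius, alpha, beta, gamma):
    """True where log10(radius) lies within +/- gamma of the valley line (assumed form)."""
    center = alpha * np.log10(period) + beta
    return np.abs(np.log10(radius) - center) < gamma


def false_alarm_probability(period, period_err, radius, radius_err,
                            n_select, n_true, alpha, beta, gamma,
                            n_sim=10_000, seed=0):
    """Fraction of random draws with n_true or fewer planets in the valley."""
    period, period_err, radius, radius_err = map(
        np.asarray, (period, period_err, radius, radius_err))
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sim):
        # Draw n_select planets without replacement from the parent catalog.
        idx = rng.choice(period.size, size=n_select, replace=False)
        # Perturb periods and radii by their (assumed Gaussian) uncertainties.
        p = rng.normal(period[idx], period_err[idx])
        r = rng.normal(radius[idx], radius_err[idx])
        ok = (p > 0) & (r > 0)  # drop nonphysical draws before taking logs
        n_sim_itv = np.count_nonzero(in_valley(p[ok], r[ok], alpha, beta, gamma))
        hits += int(n_sim_itv <= n_true)
    return hits / n_sim
```

For example, `n_select=156` with `alpha=-0.06`, `beta=0.29`, and `gamma=0.06` would correspond to the `isoc_fgk_1to2` sample and the $\mathcal{C} = 1000$ row of Table 2, with `n_true` set to the number of planets actually observed inside those boundaries.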

**Figure 11.** Effect of the regularization parameter, $\mathcal{C}$, on the boundaries of the radius valley (indicated by the dashed lines) and hence the number of planets in the valley (dark points). The samples from top to bottom are `isoc_fgk_1to2`, `isoc_fgk_lt2`, `isoc_fg_lt2`, and `gyro_gk_lt3`.

Table 3. False-alarm Probabilities

Sample	$\mathcal{C} = 5$	$\mathcal{C} = 10$	$\mathcal{C} = 100$	$\mathcal{C} = 1000$
`isoc_fgk_1to2`	30%	3.0%	0.09%	<0.01%
`isoc_fgk_lt2`	18%	6.5%	0.08%	0.01%
`isoc_fg_lt2`	15%	14%	0.2%	0.1%
`gyro_gk_lt3`	22%	4.5%	0.03%	0.08%

We also performed Monte Carlo simulations to estimate the probability that the void could be produced from random selection among those planets orbiting hosts rotating more slowly than the empirical 2.7 Gyr gyrochrone (see Figure 4). We started by performing the same generic cuts on the overall CKS sample as were performed on the rotation-selected sample (main-sequence stars, no false positives, RUWE < 1.4, T_eff < 6000 K, and planets with P < 100 days, R_P < 10 R_⊕, and radius precision <20%). Then, in 10⁴ simulations, we modeled the uncertainties in the planet periods and radii using normal distributions, selected 190 planets at random (without replacement) from the slow rotator sample, and computed the number of planets in the void. Here 190 is the total number of planets in the fast rotator sample. We found that the probability of finding a comparably empty void from the slow rotator sample is <2% when using the `gyro_gk_lt3` margins and $\mathcal{C} = 100, 1000$. Using the same $\mathcal{C}$ values but margins derived from the isochrone samples raises the false-alarm probability, but only to ∼2%–6% at most. Taken together, we conclude that the emptiness of the observed void is not due to chance.

3.5. Effects of Stellar Mass and Metallicity

Our analysis separates the data set into age bins in order to understand the evolution of planets on a population level. Since age, mass, and metallicity are correlated in the CKS sample, it is difficult to entirely disentangle the effects of each parameter on the distribution of planets in the P-R diagram. We explored how sensitive our analysis is to specific binning schemes by recording the fractional number of planets in the valley over a 2D grid of bin centers and widths in age, mass, and metallicity. We used the definition of the radius valley boundaries expressed in Section 3.3 for this purpose (specifically, the margins given by the $\mathcal{C} = 1000$, `isoc_fgk_1to2` sample case).

Figure 12 shows the results of this exercise; the young planet void is apparent as the light, broad diagonal stripe in the left panel. For small bin widths, the minimum density of the radius valley is achieved for a bin center in the range 9 < log(age) < 9.25. This is unsurprising, as the boundaries of the radius valley were identified in and derived for just such a binning strategy. However, as the bin width in log(age) increases, the minimum density of the valley shifts systematically toward younger bin centers. This suggests that the filling of the radius valley is due to preferentially older planets. Figure 12 also reveals that there is no binning strategy in mass or metallicity that can produce a comparably empty void (in a fractional sense) except in finely tuned regions of parameter space where sample sizes are small. However, while not as pronounced, we do note that the radius valley (as defined in this exercise) appears emptier for lower-mass and metal-rich stars. The latter observation is consistent with the finding of an apparently wider radius valley for metal-rich hosts within the CKS sample (Owen & Murray-Clay 2018), though we note that age and [Fe/H] are anticorrelated.

**Figure 12.** Effects of binning schemes (in age, mass, and metallicity, from left to right) on the fractional occupancy of the radius valley. Note that the color scaling is the same for each panel.

We also examined the 1D radius and period distributions for CKS planets in the extremes of the stellar age, mass, and metallicity axes (Figure 13). The purpose of this exercise was to highlight any exoplanet demographic trends as a function of these key stellar parameters. Interestingly, even without completeness corrections or occurrence rate calculations, several of the now well-established trends in the Kepler planet population are evident from Figure 13: larger sub-Neptunes around more massive (Fulton & Petigura 2018; Wu 2019; Cloutier & Menou 2020) and metal-rich (Petigura et al. 2018) stars, the rising occurrence of ultrashort-period (P < 1 day) planets with decreasing stellar mass (Sanchis-Ojeda et al. 2014), and the rising occurrence of short-period planets (1 day < P < 10 days) of all sizes with increasing metallicity (Petigura et al. 2018). The other trend that is apparent is the dearth of planets in the radius valley for young stars. The trend of a wider radius valley around more metal-rich stars found by Owen & Murray-Clay (2018) is not immediately obvious, but relative to that study, we use updated planetary radii from F18, do not include completeness corrections, and perform slightly different cuts.

**Figure 13.** Planet radius (top row) and period (bottom row) distributions among our CKS base sample for host stars of different age (left column), mass (middle column), and metallicity (right column).

3.6. Accounting for Age Uncertainties

In Section 3.4, we assessed the probability that the observed void was due to a chance selection of planets from the overall CKS data set, and in Section 3.5, we explored the sensitivity of the void occupancy to binning schemes in mass, age, and metallicity. Here we attempt to account for stellar age uncertainties in examining the void occupancy as a function of age. As discussed in Appendix B, there may be substantial uncertainties in stellar ages, particularly if those ages originate from isochrones. As a result, when binning in stellar age, there is considerable uncertainty in the degree of contamination by stars with inaccurate ages.

To mitigate the effect of stars with spurious ages, we performed Monte Carlo simulations in which the ages were modeled as normal distributions in log(age) centered on the median values published in F18 with widths taken as the maximum of the lower and upper age uncertainties for each star. While this is not the same as drawing from the empirical posterior probability density functions in age (which are not available), it is a crude proxy. For 50 bin centers in log(age) from 8.25 to 10.25, we then performed 10³ Monte Carlo simulations in each bin, randomly generating ages as described above, to measure the fraction of planets in the valley as a function of age. We measured the fraction of planets in the valley relative to the total number of planets in each age bin for bin widths of 0.125, 0.25, 0.5, and 1.0 dex. The results of this analysis are shown in Figure 14. The scarcity of planets with log(age) < 9 and >10 leads to large uncertainties in the trend at both extremes, in addition to the larger age uncertainties at younger ages. However, we observe a marginally significant increase in the fraction of planets located in the radius valley in the range 8.75 < log(age) < 9.75. This is in agreement with Figures 2 and 3, which show that the radius valley appears weaker among the oldest planets in the CKS sample. Computing planet occurrence rates in the valley as a function of age might lead to a more robust conclusion on the trend noted here but is outside the scope of the present work.
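
A compact sketch of this procedure is given below, assuming each planet already carries a Boolean valley flag and a single symmetric log(age) uncertainty (the larger of the two published error bars, as described above). The function name and the percentile summary are illustrative assumptions.

```python
import warnings

import numpy as np


def valley_fraction_vs_age(log_age, log_age_err, valley_flag, centers, width,
                           n_sim=1000, seed=0):
    """Monte Carlo valley occupancy per log(age) bin.

    Returns a (3, n_centers) array holding the 16th, 50th, and 84th
    percentiles of the valley fraction over the simulations; bins that
    are empty in every simulation are NaN.
    """
    log_age, log_age_err = np.asarray(log_age), np.asarray(log_age_err)
    valley_flag = np.asarray(valley_flag, dtype=bool)
    centers = np.asarray(centers)
    rng = np.random.default_rng(seed)
    frac = np.full((n_sim, centers.size), np.nan)
    for i in range(n_sim):
        # One realization of the ages: Gaussian in log(age) about the median.
        draw = rng.normal(log_age, log_age_err)
        for j, c in enumerate(centers):
            in_bin = np.abs(draw - c) <= width / 2
            if in_bin.any():
                frac[i, j] = valley_flag[in_bin].mean()
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", RuntimeWarning)  # all-NaN bins
        return np.nanpercentile(frac, [16, 50, 84], axis=0)
```

Calling it with `centers=np.linspace(8.25, 10.25, 50)` and `width` set in turn to 0.125, 0.25, 0.5, and 1.0 would mirror the grid described in the text.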

**Figure 14.** Results of Monte Carlo simulations exploring the occupancy of the radius valley as a function of age, accounting for uncertainties in stellar ages. The dark line shows the median trend resulting from the simulations, while the shaded contours show the 68.3rd, 95.4th, and 99.7th percentile ranges. The curves have been smoothed with a Savitzky–Golay filter for clarity.

3.7. In What Ways Are Planets in the Valley Different?

In an effort to quantify the parameters that are most important in contributing to the filling of the radius valley, we performed a k-sample Anderson–Darling (A-D) test (Scholz & Stephens 1987) with `scipy.stats.anderson_ksamp` to test the null hypothesis that the distribution of a given variable for stars hosting planets in the valley was drawn from the same distribution of that parameter among our base CKS sample.

The results of this exercise are summarized in Table 4, and select parameter distributions are shown in Figure 15. In this exercise, we have assumed that the equations for the radius valley and its boundaries are given by the fourth row of Table 2. This choice is motivated by the fact that higher regularization parameters correspond to tighter boundaries of the valley, offering a cleaner separation between planets in, above, or below the valley. Additionally, for each parameter, we restrict our analysis to those CKS stars/planets for which that parameter is defined (i.e., we exclude targets missing data for a given variable). We also apply the common cuts described in Section 3.3 before performing the A-D tests. In the case of P_rot, we also restricted the sample to T_eff < 6000 K, where rotation periods are more reliable indicators of age.
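
One way such a comparison might be set up with `scipy.stats.anderson_ksamp` is sketched below. The wrapper name is hypothetical, and treating the control sample as every base-sample entry with the parameter defined (valley planets included) is an assumption. SciPy's default interpolated p-value is floored at 0.001 and capped at 0.25 (it emits a warning when either limit is hit), which matches the range of p-values listed in Table 4.

```python
import warnings

import numpy as np
from scipy.stats import anderson_ksamp


def compare_valley_to_base(values, valley_flag):
    """k-sample A-D test of one parameter: valley hosts versus the base sample.

    Entries where the parameter is undefined (NaN) are excluded from both
    samples. Returns (statistic, p_value, (n_valley, n_control)).
    """
    values = np.asarray(values, dtype=float)
    valley_flag = np.asarray(valley_flag, dtype=bool)
    defined = np.isfinite(values)
    valley = values[defined & valley_flag]
    control = values[defined]  # assumption: control = all defined base entries
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # silence the floor/cap warnings
        stat, _, p_value = anderson_ksamp([valley, control])
    return stat, p_value, (valley.size, control.size)
```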

**Figure 15.** Comparison of 1D parameter distributions for planets in the valley versus the CKS base sample (as described in Section 3.7).

Table 4. Results of k-sample A-D Tests

Parameter	Ref.	A-D Test Stat.	A-D p-value	Sample Size (Valley/Control)
${\sigma }_{{R}_{P}}/{R}_{P}$	F18	14.30	0.0010	196/1443
P_rot flag	D21	6.08	0.0015	196/1443
P_rot	M15	4.82	0.0040	135/1055
R_var	M15	4.67	0.0045	180/1334
log(age)	F18	4.37	0.0059	196/1443
R_*	F18	3.61	0.011	196/1443
r₈	F18	2.53	0.030	196/1443
P_rot	M13	2.17	0.042	36/371
P_rot	D21	1.95	0.051	55/592
P_rot	A18	0.67	0.17	109/873
S/N₁	D21	0.65	0.18	190/1420
T_eff	F18	0.62	0.18	196/1443
R_τ	P20	0.30	>0.25	190/1415
${\sigma }_{{R}_{\star }}/{R}_{\star }$	F18	0.25	>0.25	196/1443
M_*	F18	0.21	>0.25	196/1443
Parallax	F18	0.096	>0.25	196/1443
CDPP3	D21	−0.16	>0.25	190/1420
P_rot	W13	−0.19	>0.25	34/335
A_V	B20	−0.50	>0.25	188/1390
G mag	DR2	−0.56	>0.25	190/1420
A_V	L21	−0.57	>0.25	187/1382
[Fe/H]	F18	−0.63	>0.25	196/1443
RUWE	D21	−0.64	>0.25	190/1418
RCF	F18	−0.94	>0.25	55/423

Note. References: A18, Angus et al. 2018; D21, this work; DR2, Gaia Collaboration et al. 2018; F18, Fulton & Petigura 2018; L21, Lu et al. 2021; M15, Mazeh et al. 2015; M13, McQuillan et al. 2013; P20, Petigura 2020; W13, Walkowicz & Basri 2013.

Of the nine parameters with the highest normalized k-sample A-D test statistics (and p-values <0.05), all but two pertain directly or indirectly to the star’s evolutionary state: P_rot from various sources, P_rot flag, R_var (a measure of the photometric variability amplitude), median posterior age from isochrones, and R_*. The other two parameters are fractional R_P precision and r₈ (a measure of flux dilution). Thus, of the parameters investigated, those that contribute most to the filling of the radius valley either relate to stellar age or may be associated with erroneous measurements of the planetary radii. Inspection of the parameter distributions (like those shown in Figure 15) reveals that planets in the valley tend to orbit stars that are older, larger, less likely to have a securely detected rotation period, rotating more slowly, and photometrically quieter. Planets in the valley also have lower r₈ values relative to the CKS base sample. Naively, one might expect higher r₈ values among planets in the radius valley, as flux dilution can lead to erroneous planet radius measurements. However, planets in the radius valley are by definition small, so it is perhaps not surprising that there is a preference for stars not affected by crowding.

We also found that the stellar mass and metallicity distributions for stars hosting planets in the radius valley are statistically indistinguishable from those of the CKS base sample. This lends further support to the notion that the feature identified in the CKS data set from age selections is not due to correlations between stellar age, mass, and metallicity.

The parameter that appears to be most important in contributing to the filling of the radius valley is planet radius precision. This suggests that the radius valley may be emptier than is suggested by current data. The fractional stellar radius precision does not, however, contribute to the filling of the valley. This is not surprising, as the typical error budget for a planet’s radius in the CKS sample is dominated by the R_P/R_⋆ uncertainty from light-curve fitting rather than the stellar radius uncertainty (Petigura 2020). In Section 3.8, we examine the possibility that a correlation between planet radius precision and age could conspire to produce the observed void.

3.8. Confounding Scenarios

A possibility not yet explored is that the radius valley is inherently empty but, for some reason, planets orbiting stars with younger assigned ages in the CKS sample have more precise radii. In Section 3.7, we established that the planet radius precision is the most important parameter contributing to the filling of the radius valley. We quantified the correlation between fractional planet radius precision and log(age) by computing the Spearman rank correlation coefficient for planets with P < 100 days, R_P < 10 R_⊕, non-false-positive dispositions, and main-sequence host stars (filters 1, 2, 4, and 5 from Section 2). We used the scipy.stats.spearmanr function for this purpose and found a small p-value (2 × 10⁻⁴) but a very weak correlation coefficient (ρ < 0.1).
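The calculation above can be sketched as follows. The data are synthetic and constructed to have only a weak dependence on age; the variable names are assumptions made for this example and do not correspond to CKS table columns.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
n_planets = 1500

# Synthetic stand-ins (not CKS data): log10(age/yr) and a fractional radius
# uncertainty built to depend only weakly on age.
log_age = rng.normal(9.5, 0.35, n_planets)
z_age = (log_age - log_age.mean()) / log_age.std()
latent = 0.1 * z_age + np.sqrt(1.0 - 0.1**2) * rng.normal(size=n_planets)
frac_radius_err = 0.05 * np.exp(0.3 * latent)

# Spearman's rho depends only on ranks, so the monotonic exp() above does not
# change it relative to the latent Gaussian variable.
rho, p_value = spearmanr(frac_radius_err, log_age)
print(f"rho = {rho:.3f}, p-value = {p_value:.2e}")
```

With a sample of this size, a correlation can be statistically significant while remaining weak, which is the situation described in the text.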

To further investigate the impact of radius precision, we computed the fraction of planets in the valley for young and old samples as a function of fractional radius precision allowed. For a given sample of planets and over a grid of radius precision thresholds, we selected the planets with fractional radius uncertainties smaller than the threshold value and computed the ratio of planets in the valley to the total number of planets meeting the radius precision requirement. We performed 10³ bootstrapping simulations (including modeling of the planet radii as normal distributions) to determine the uncertainties on these trends, which are shown in Figure 16. We found that the radius valley is comparably empty for young and old planets if the fractional radius precision is required to be better than ∼5%. However, this is not unexpected, as our CKS base sample size diminishes steeply below fractional radius uncertainties of 7%. For reference, the median fractional radius uncertainties for the CKS base sample, young isochrone age-selected sample (isoc_fgk_lt2), and old isochrone age-selected sample are 5.0%, 4.5%, and 5.2%, respectively.
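A minimal sketch of this procedure is given below. It assumes fixed, hypothetical valley boundaries (`VALLEY_LO`, `VALLEY_HI`) in place of the period-dependent definition used in the paper, and it runs on a synthetic bimodal radius distribution rather than the CKS sample.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical fixed valley bounds in Earth radii (an assumption for this sketch).
VALLEY_LO, VALLEY_HI = 1.5, 2.0


def valley_fraction(radius, frac_err, max_frac_err):
    """Fraction of planets with frac_err < max_frac_err that lie in the valley."""
    keep = frac_err < max_frac_err
    if not keep.any():
        return np.nan
    r = radius[keep]
    return np.mean((r > VALLEY_LO) & (r < VALLEY_HI))


def bootstrap_valley_fraction(radius, radius_err, thresholds, n_boot=1000):
    """16th, 50th, and 84th percentiles of the valley fraction per threshold."""
    frac_err = radius_err / radius  # published fractional uncertainty
    n = radius.size
    draws = np.empty((n_boot, thresholds.size))
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample planets with replacement
        r_draw = rng.normal(radius[idx], radius_err[idx])  # perturb each radius
        for j, t in enumerate(thresholds):
            draws[i, j] = valley_fraction(r_draw, frac_err[idx], t)
    return np.nanpercentile(draws, [16, 50, 84], axis=0)


# Synthetic bimodal radius distribution (not CKS data).
n_planets = 500
is_small = rng.random(n_planets) < 0.5
radius = np.where(is_small,
                  rng.lognormal(np.log(1.3), 0.15, n_planets),
                  rng.lognormal(np.log(2.4), 0.15, n_planets))
radius_err = radius * rng.uniform(0.02, 0.12, n_planets)

thresholds = np.linspace(0.04, 0.12, 9)
lo, med, hi = bootstrap_valley_fraction(radius, radius_err, thresholds)
for t, l, m, h in zip(thresholds, lo, med, hi):
    print(f"max frac. err {t:.2f}: {m:.3f} (+{h - m:.3f}/-{m - l:.3f})")
```

The selection on precision uses the published fractional uncertainty, while the valley membership uses the perturbed radius, so each bootstrap draw reflects both sampling noise and measurement error.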

**Figure 16.** Fraction of planets in the radius valley as a function of maximum fractional radius uncertainty. For each age-selected sample shown, the fraction of planets in the valley is computed for the subsample of planets with fractional radius uncertainties lower than the value on the abscissa. Solid lines and shaded bands show the median and 16th–84th percentile range from bootstrapping simulations, respectively. In each panel, the fiducial dashed line shows the value of fractional radius uncertainty for which both young and old samples contain more than 150 planets (corresponding to 21 ± 4 expected planets in the valley if selected at random from the CKS base sample).

Finally, we examined the 1D radius distributions for the young and old isochrone age-selected samples with planet radii known to better than 5%. We performed 10³ bootstrapping simulations (again modeling the planet radii as normal distributions) to determine the uncertainties on these distributions. The results are shown in Figure 17. While radius precision is clearly an important parameter in determining the occupancy of the radius valley, we conclude that it is unlikely to explain the entire deficit observed for the young planet sample.

**Figure 17.** Small planet size distributions among CKS planets with fractional radius uncertainties better than 5%. Uncertainties (the 16th and 84th percentiles) are determined from bootstrapping simulations with planet radii modeled as normal distributions.

3.9. Is the Radius Gap Empty?

Figures 2, 3, and 5 give the impression that the radius valley progressively fills in over time, becoming weaker or disappearing entirely among older planet populations. This interpretation appears to be at odds with the results of V18, who observed a clean gap in the P-R distribution of planets orbiting asteroseismic host stars, which are preferentially older than the stars in our young sample.¹⁴ Notably, planets with ages ≳3 Gyr in the CKS base sample have a median radius precision of 5.3%, while planets in the V18 sample have a median radius precision of 3.3%. Similarly, in Section 3.8, we found that age-dependent differences in the radius valley filling factor can be resolved at least partially by restricting the analysis to planets with the most precise radii.

To further investigate this issue, we constructed a new sample, the gold sample, which implements several reliability cuts. In addition to the cuts of the base sample, the gold sample is restricted to planets with a fractional radius precision of <6%, a fractional R_P/R_* precision of <6%, nongrazing transits (R_τ > 0.6), RCF < 1.05, A_V < 0.5 mag, RUWE < 1.1, agreement between the F18 isochrone-derived and trigonometric parallaxes, and a KOI reliability score >0.99 from the Q1-Q17 DR25 catalog. We then split this sample into young (gold_lt3) and old (gold_gt3) samples. The young sample includes the restriction that the stellar age inferred from both isochrones and gyrochronology is <3 Gyr, while the old sample requires a planet host to have a median isochrone age >3 Gyr and not to have a rotation period consistent with an age <3 Gyr. For both the gold_lt3 and gold_gt3 samples, we confirmed that the corresponding distributions in R_P precision, R_P/R_* precision, and single-transit S/N (defined as $(R_P/R_*)^2/\mathrm{CDPP}3$) were not statistically different either from each other or from the overall distributions in the gold sample, yielding p-values > 0.25 in each case from k-sample A-D tests. The gold_lt3 sample is shown in comparison to the V18 and gold samples in the P-R diagram in Figure 18. From that figure, it appears that the reliability cuts have a significant impact on how well defined the super-Earth and sub-Neptune distributions are, as well as how empty the gap appears at all ages, though it is not entirely devoid of planets. Furthermore, it is clear from Figure 18 that the gap in the gold_lt3 sample is offset from the gap in both the V18 and gold samples, indicating that the difference is unlikely to be due to systematic differences in planet radii between the two studies. For reference, we also show the P-R distribution of V18 planets using the CKS radii, which highlights the importance of the precise (R_P/R_⋆) values used by V18 in resolving the gap (for a detailed discussion, see Petigura 2020).
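The reliability cuts and age split described above can be written compactly as a set of predicates. The sketch below uses only the Python standard library; all dictionary keys (for example `frac_rp_err` and `koi_score`) are hypothetical names invented for this example, the parallax agreement is reduced to a precomputed boolean because the text gives no numerical criterion, and the four planet records are made up.

```python
# Reliability cuts listed in the text. The dictionary keys are hypothetical
# column names chosen for this sketch.
GOLD_CUTS = {
    "frac_rp_err": lambda v: v < 0.06,    # fractional R_P precision
    "frac_rprs_err": lambda v: v < 0.06,  # fractional R_P/R_* precision
    "r_tau": lambda v: v > 0.6,           # nongrazing transits
    "rcf": lambda v: v < 1.05,            # RCF
    "a_v": lambda v: v < 0.5,             # A_V in mag
    "ruwe": lambda v: v < 1.1,            # RUWE
    "plx_agree": lambda v: bool(v),       # isochrone vs. trigonometric parallax
    "koi_score": lambda v: v > 0.99,      # KOI reliability score
}


def passes_gold_cuts(planet):
    """True if a planet record satisfies every reliability cut."""
    return all(test(planet[key]) for key, test in GOLD_CUTS.items())


def age_subsample(planet, age_cut_gyr=3.0):
    """Return 'gold_lt3', 'gold_gt3', or None following the rules in the text."""
    gyro_age = planet["gyro_age_gyr"]  # None when no rotation-based age exists
    gyro_young = gyro_age is not None and gyro_age < age_cut_gyr
    if planet["iso_age_gyr"] < age_cut_gyr and gyro_young:
        return "gold_lt3"
    if planet["iso_age_gyr"] > age_cut_gyr and not gyro_young:
        return "gold_gt3"
    return None


# Made-up records for illustration only.
base = {"frac_rp_err": 0.04, "frac_rprs_err": 0.03, "r_tau": 0.9, "rcf": 1.0,
        "a_v": 0.1, "ruwe": 1.0, "plx_agree": True, "koi_score": 0.995}
planets = [
    {**base, "name": "A", "iso_age_gyr": 1.5, "gyro_age_gyr": 2.0},
    {**base, "name": "B", "iso_age_gyr": 6.0, "gyro_age_gyr": None},
    {**base, "name": "C", "iso_age_gyr": 1.0, "gyro_age_gyr": 1.2,
     "frac_rp_err": 0.08},
    {**base, "name": "D", "iso_age_gyr": 5.0, "gyro_age_gyr": 1.0},
]

gold = [p for p in planets if passes_gold_cuts(p)]
assignments = {p["name"]: age_subsample(p) for p in gold}
print(assignments)
```

Planets whose isochrone and rotation-based ages disagree about the 3 Gyr boundary fall into neither subsample, which mirrors the asymmetric definitions of the young and old samples in the text.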

**Figure 18.** Planet distributions in the P-R diagram for the samples described in Section 3.9. In the middle panel, planets in the V18 asteroseismic sample are plotted using the F18 radii. Lines connect each planet in that sample to its radius as determined by V18.

We proceeded to perform the same SVM analysis as was presented in Section 3.3 with one difference: classification of the samples into super-Earths and sub-Neptunes was performed using the threshold R_P = 1.8 R_⊕ rather than a period-dependent classification scheme. We made this choice because the scheme clearly works well for the gold_lt3 sample and allows us to test the sensitivity of the analysis to the classification step. The results of our analysis are presented in Figure 19 and Table 5. We find that despite the simplified classification scheme, a negative slope in the P-R diagram is still preferred (though the data are also consistent with no orbital period dependence). Furthermore, we observe that, independent of regularization parameter, there is a persistent offset in the center of the gap for the gold_lt3 sample, compared to both the gold and gold_gt3 samples.
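To make the procedure concrete, the sketch below fits a linear support vector classifier in the log P–log R_P plane and converts its decision boundary into a slope and a gap center at 10 days. It is an illustration under stated assumptions, not the authors' pipeline: the planets are synthetic, the function name `fit_gap_line` is hypothetical, and the bootstrap resamples planets only, without perturbing radii.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)

# Synthetic planets (not CKS data): two populations in log10(P)-log10(R_P)
# separated by a gap whose center falls with period.
n_each = 200
log_p = rng.uniform(0.0, 2.0, 2 * n_each)
true_slope = -0.1
center = np.r_[np.full(n_each, 0.10), np.full(n_each, 0.40)]
scatter = np.r_[rng.normal(0, 0.04, n_each), rng.normal(0, 0.05, n_each)]
log_r = center + true_slope * (log_p - 1.0) + scatter

# Classification step from the text: a fixed threshold at R_P = 1.8 R_earth.
labels = (10**log_r > 1.8).astype(int)  # 0 = super-Earth, 1 = sub-Neptune
features = np.column_stack([log_p, log_r])


def fit_gap_line(x, y, c_reg):
    """Fit a linear SVM and return (slope, R10) of its decision boundary."""
    clf = SVC(kernel="linear", C=c_reg).fit(x, y)
    w_p, w_r = clf.coef_[0]
    b = clf.intercept_[0]
    slope = -w_p / w_r                 # d log10(R_P) / d log10(P)
    intercept = -b / w_r               # log10(R_P) at log10(P) = 0
    r10 = 10 ** (slope + intercept)    # boundary radius at P = 10 days
    return slope, r10


slope, r10 = fit_gap_line(features, labels, c_reg=10)

# Bootstrap over planets to estimate the spread of the fitted parameters.
boot = []
for _ in range(100):
    idx = rng.integers(0, labels.size, labels.size)
    boot.append(fit_gap_line(features[idx], labels[idx], c_reg=10))
boot = np.array(boot)

print(f"slope = {slope:.3f}, R10 = {r10:.2f} R_earth")
print("bootstrap 16th and 84th percentiles (slope, R10):")
print(np.percentile(boot, [16, 84], axis=0))
```

The boundary of a linear SVM satisfies w_P log P + w_R log R_P + b = 0, so the slope and intercept follow directly from the fitted coefficients. Repeating the fit for several values of `c_reg` shows how the regularization parameter trades margin width against misclassified planets.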

**Figure 19.** Smoothed Gaussian kernel density estimates of the distributions of the intercepts (left) and slopes (right) resulting from the SVM bootstrapping analysis of the CKS `gold` samples presented in Section 3.9. At left, the R₁₀ parameter indicates the center of the radius gap at an orbital period of 10 days. Each row corresponds to a different regularization parameter, $\mathcal{C}$, indicated in the figure legend. At right, vertical lines indicate predictions from the impact erosion (IE; Wyatt et al. 2020), photoevaporation (PE; Lopez & Rice 2018), core-powered mass-loss (CPML; Gupta & Schlichting 2019), and gas-poor formation (GPF; Lopez & Rice 2018) theories.

Table 5. Results of SVM Bootstrapping Simulations for the CKS Gold Samples

Sample	$\mathcal{C}$	α	β	γ	δ	ε	ζ
`gold`	5	$-0.18_{-0.07}^{+0.09}$	$0.42_{-0.07}^{+0.07}$	$0.15_{-0.02}^{+0.01}$	$0.11_{-0.05}^{+0.05}$	$0.02_{-0.07}^{+0.13}$	$0.15_{-0.02}^{+0.01}$
`gold`	10	$\mathbf{-0.12_{-0.06}^{+0.07}}$	$\mathbf{0.37_{-0.06}^{+0.05}}$	$\mathbf{0.11_{-0.01}^{+0.01}}$	$\mathbf{0.07_{-0.04}^{+0.04}}$	$\mathbf{0.11_{-0.08}^{+0.11}}$	$\mathbf{0.11_{-0.01}^{+0.01}}$
`gold`	100	$-0.05_{-0.04}^{+0.06}$	$0.31_{-0.05}^{+0.04}$	$0.06_{-0.01}^{+0.01}$	$0.03_{-0.04}^{+0.03}$	$0.19_{-0.05}^{+0.1}$	$0.06_{-0.01}^{+0.01}$
`gold`	1000	$-0.02_{-0.05}^{+0.05}$	$0.28_{-0.06}^{+0.04}$	$0.03_{-0.01}^{+0.01}$	$0.01_{-0.04}^{+0.02}$	$0.24_{-0.07}^{+0.05}$	$0.03_{-0.01}^{+0.01}$
`gold_gt3`	5	$-0.22_{-0.11}^{+0.09}$	$0.49_{-0.1}^{+0.1}$	$0.16_{-0.02}^{+0.02}$	$0.15_{-0.06}^{+0.04}$	$-0.04_{-0.11}^{+0.12}$	$0.16_{-0.02}^{+0.02}$
`gold_gt3`	10	$-0.13_{-0.08}^{+0.09}$	$0.4_{-0.09}^{+0.07}$	$0.12_{-0.02}^{+0.01}$	$0.09_{-0.06}^{+0.03}$	$0.06_{-0.08}^{+0.13}$	$0.12_{-0.02}^{+0.01}$
`gold_gt3`	100	$-0.03_{-0.04}^{+0.05}$	$0.29_{-0.05}^{+0.04}$	$0.06_{-0.01}^{+0.01}$	$0.03_{-0.03}^{+0.03}$	$0.2_{-0.06}^{+0.08}$	$0.06_{-0.01}^{+0.01}$
`gold_gt3`	1000	$-0.01_{-0.03}^{+0.04}$	$0.27_{-0.04}^{+0.04}$	$0.04_{-0.01}^{+0.01}$	$0.01_{-0.02}^{+0.03}$	$0.24_{-0.06}^{+0.04}$	$0.04_{-0.01}^{+0.01}$
`gold_lt3`	5	$-0.14_{-0.07}^{+0.06}$	$0.34_{-0.05}^{+0.06}$	$0.15_{-0.02}^{+0.01}$	$0.07_{-0.04}^{+0.03}$	$0.08_{-0.05}^{+0.11}$	$0.14_{-0.02}^{+0.01}$
`gold_lt3`	10	$-0.1_{-0.04}^{+0.03}$	$0.33_{-0.05}^{+0.03}$	$0.11_{-0.01}^{+0.01}$	$0.06_{-0.02}^{+0.02}$	$0.11_{-0.04}^{+0.04}$	$0.11_{-0.01}^{+0.01}$
`gold_lt3`	100	$-0.05_{-0.05}^{+0.02}$	$0.28_{-0.03}^{+0.04}$	$0.07_{-0.01}^{+0.01}$	$0.03_{-0.01}^{+0.06}$	$0.18_{-0.13}^{+0.02}$	$0.07_{-0.01}^{+0.01}$
`gold_lt3`	1000	$-0.05_{-0.04}^{+0.02}$	$0.27_{-0.03}^{+0.02}$	$0.06_{-0.01}^{+0.01}$	$0.03_{-0.01}^{+0.06}$	$0.16_{-0.13}^{+0.01}$	$0.06_{-0.01}^{+0.01}$


Related to this last point, we emphasize that the gap identified in this work is primarily due to a lack of large super-Earths at young ages, as opposed to a difference in the sub-Neptune size distribution or some combination of the two. This is most apparent in Figures 17 and 18. We note that the young planet samples are always smaller than the control samples, and the dearth of large super-Earths at young ages could be due in part to small number statistics. To assess the probability of this scenario, we performed 10⁴ simulations and measured the fraction of outcomes in which the number of >1.5 R_⊕ planets in a control sample was equal to or fewer than the number of >1.5 R_⊕ planets in the young sample. In each simulation, we selected 40 planets at random from the control sample, corresponding to the size of our young super-Earth sample. We accounted for planet radius uncertainties in both the young and control samples by modeling the radii as normal distributions given their published uncertainties. For the control samples, we used the CKS gold and V18 samples, where we used the V18 radius valley equation to select only the super-Earths in each. In both cases, we found that ≲2% of the simulations resulted in outcomes where the number of >1.5 R_⊕ super-Earths was greater in the young sample than in the control sample. We also compared the young and control super-Earth size distributions with a k-sample A-D test, finding that the null hypothesis can be rejected at the 1% level.
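The small-number-statistics test above can be sketched as a short Monte Carlo loop. The radii below are synthetic, the function name `chance_fraction` is hypothetical, and drawing the control subsample without replacement is an assumption, since the text does not specify it.

```python
import numpy as np

rng = np.random.default_rng(4)


def chance_fraction(young_r, young_err, control_r, control_err,
                    r_min=1.5, n_sim=10_000):
    """Fraction of simulations in which a random control draw has no more
    planets above r_min than the young sample does."""
    n_young = young_r.size
    hits = 0
    for _ in range(n_sim):
        # Draw a control subsample matching the young sample size.
        idx = rng.choice(control_r.size, size=n_young, replace=False)
        # Model every radius as a normal distribution about its reported value.
        control_draw = rng.normal(control_r[idx], control_err[idx])
        young_draw = rng.normal(young_r, young_err)
        if np.sum(control_draw > r_min) <= np.sum(young_draw > r_min):
            hits += 1
    return hits / n_sim


# Synthetic super-Earth radii in Earth radii (not the CKS or V18 samples).
young_r = rng.uniform(0.9, 1.55, 40)
control_r = rng.uniform(0.9, 1.8, 300)
young_err = 0.05 * young_r
control_err = 0.05 * control_r

frac = chance_fraction(young_r, young_err, control_r, control_err)
print(f"fraction of simulations with control <= young: {frac:.4f}")
```

A small returned fraction indicates that a random draw from the control population rarely contains as few large super-Earths as the young sample.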

In conclusion, we propose a solution to resolve the apparent tension described in the beginning of this section and to explain all of the observations to date: the radius gap is intrinsically empty, or at least emptier than previously appreciated, but its precise location shifts with the age of the planetary population. Since the radius gap appears to have an orbital period dependence, a gap that is intrinsically empty in the P-R plane will always appear filled in when projected along the radius axis, even if the radii are perfectly known. Similarly, if the location of the gap also depends on host star mass, age, or metallicity, as has been suggested, then the gap will only appear empty in sufficiently narrow projections of parameter space. While this proposed solution would help explain some of our observations, we emphasize that we have not conclusively shown it to be the case. Confirming or rejecting this hypothesis may be possible with (1) a larger sample providing sufficient coverage of the P-R plane across the variables of interest and/or (2) a thorough, multivariate investigation of the radius gap in order to find the projection of the data resulting in the emptiest gap. We leave such an investigation to future works and emphasize that planet radius uncertainties (resulting from inaccurate light-curve fits, stellar radius uncertainties, or more pernicious sources, such as flux dilution from unresolved binaries) remain an obstacle to our understanding of the radius gap.

4. Discussion and Conclusions

We observe a nearly empty void in the P-R plane for close-in (P < 100 days) exoplanets orbiting stars younger than ∼2–3 Gyr. The void was first identified among a sample of planets with median posterior isochrone ages <1.8 Gyr but is also present among planets with stars rotating faster than an empirical 2.7 Gyr gyrochrone. The difference between these two timescales could conceivably be due to systematic offsets between the CKS isochrone ages and ages implied from a gyrochronology analysis. Because the gyrochrone used to perform our sample selection is calibrated to open clusters with main-sequence turnoff ages, the longer timescale may more accurately reflect the lifetime of this feature in the P-R diagram.

We derived equations for the center of this void, which we refer to as the young planet gap, as a function of orbital period, P, and insolation, S_inc:

For periods in the range of 3–30 days, describing the bulk of our sample, this places the center of the radius valley at 1.87–1.56 R_⊕. Over this same period range, the lower boundary of the void is in the range of ∼1.6–1.4 R_⊕, while the upper boundary is at ∼2.1–1.8 R_⊕. From a subset of the CKS sample created using reliability and precision cuts, we similarly derived equations for the radius valley valid for all ages:

The slope of the young planet gap in the P-R diagram is consistent at the 1σ level with the slope of the radius valley measured from the asteroseismic sample in V18 but with an intercept that is smaller by ∼3σ using the uncertainty reported by those authors. The smaller intercept among the “young” planet sample corresponds to a shift in the radius valley toward smaller radii and would be compatible with a prolonged mass-loss timescale for the sub-Neptune progenitors of the largest observed super-Earths. An alternative explanation could be the late-time formation of secondary or “revived” atmospheres through endogenous or exogenous processes (e.g., Kite et al. 2020; Kite & Barnett 2020; Kite & Schaefer 2021). Differentiating between these two hypotheses might be achieved with detailed composition modeling or atmospheric studies of the largest super-Earths.

The shallow, negative slope of the void is compatible with models of atmospheric loss through photoevaporation (e.g., Owen & Wu 2013, 2017; Jin & Mordasini 2018; Lopez & Rice 2018) or core cooling (Gupta & Schlichting 2019, 2020) but incompatible with the steeper negative slope implied for one model of impact-driven atmospheric erosion (Wyatt et al. 2020). The negative slope we find is also incompatible with the positive slope predicted by models of late-stage formation in a gas-poor disk (Lopez & Rice 2018). However, we note that the void is only marginally inconsistent with being flat when adopting a more conservative regularization parameter in the SVM analysis. The slope of this void in the insolation–radius plane is shallower (by 2σ–3σ) than the slope found by Martinez et al. (2019).

Both rotation-selected and isochrone-selected planet samples show the same qualitative trend: an absence of large super-Earths at young ages. We estimate that the probability of this feature being due to chance is <1% for both the isochrone-selected and rotation-selected samples. We also showed that this feature is relatively insensitive to various data reliability filters and unlikely to be the result of correlations between stellar age, mass, and metallicity. Simulations accounting for age and planet radius uncertainties show an increasing fraction of planets residing in this gap as a function of age (see Figure 14). The occupancy of the radius valley is also clearly dependent on the precision of planetary radii, and we note that the differences between the young and old planetary samples diminish with more stringent precision requirements (Figure 16). However, resolving the discrepancies entirely requires discarding more than half of the CKS sample. A larger sample size and higher-precision planetary radii for the entire CKS sample would help to more securely determine how much of the discrepancy between young and old planet populations is astrophysical and how much is due to noise.

While a more detailed study of planet radius demographics as a function of age, mass, and metallicity is left for future work, we note that our findings are broadly consistent with expectations from both the core-powered mass loss and photoevaporation theories. This is most evident from Figure 20, where the sub-Neptune size trends with age, mass, and metallicity among those CKS planets with the most precise radii are shown in relation to scalings that approximately, though not exactly, mimic those presented in Gupta & Schlichting (2020). The slope in the stellar mass versus planet radius plane is shallower than predicted in core-cooling models but more similar to that predicted by photoevaporation models, provided that planet mass scales approximately linearly with stellar mass (Wu 2019). The gigayear timescale we find for the evolution of the radius valley is more compatible with core-powered mass-loss models than the canonical timescale of 0.1 Gyr from photoevaporation. However, although photoevaporation models predict that the radius gap will emerge on a timescale of 0.1 Gyr, some small fraction of planets are expected to cross the gap on timescales of ∼1 Gyr or more (Rogers & Owen 2021), which is compatible with our observations.

**Figure 20.** The 2D Gaussian kernel density estimates of the distribution of CKS planets in the age–radius (left), stellar mass–radius (middle), and metallicity–radius (right) planes. Our base sample is shown with the additional requirement of fractional radius uncertainties <6%. The light dashed lines indicate the nominal location of the radius valley. The dark dashed lines are not fits but are drawn as a visual guide, with their slopes indicated in the upper left corner of each panel.

The difference in the radius distributions between young and old planets is primarily driven by an absence of large super-Earths (1.5–1.8 R_⊕) at young ages, rather than an absence of small sub-Neptunes (see Figure 17). As a result, the precise location of the radius valley is shifted to larger planet sizes at older ages. To better understand the compositions of planets missing from the young planet radius distribution, we compiled data for well-characterized, confirmed exoplanets from the NASA Exoplanet Archive (Akeson et al. 2013). We selected planets with masses known to 25% precision or better, radii with 10% precision or better, orbital periods <100 days, and host stars with 4500 K < T_eff < 6500 K to match the CKS sample. We computed the bulk densities of these planets and compared the distribution of planets in the radius–density plane to composition curves from Zeng et al. (2019). Among these well-characterized planets, we observe a clean separation in the radius–density plane (also observed by Sinukoff 2018) between planets that are consistent with rocky compositions and those that require a significant volatile component (such as an H₂–He atmosphere, H₂O-dominated ices/fluids, or some combination of the two) to explain their bulk compositions (Figure 21). We also observed that the radius valley identified in Van Eylen et al. (2018) bridges the gap between planets in these two composition regimes. Meanwhile, the young planet gap identified in this work appears to correspond only to planets in the rocky composition regime. Thus, assuming the atmospheric loss hypothesis is correct, the planets that eventually fill the young planet gap may correspond to the large end of the size distribution of stripped cores.
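For reference, the bulk density calculation and the selection criteria described above can be sketched as follows. The function names and the example planet are hypothetical, the first-order error propagation assumes independent mass and radius uncertainties, and the adopted mean density of Earth (5.514 g cm⁻³) is a standard value rather than one quoted in the text.

```python
import math

EARTH_DENSITY_CGS = 5.514  # mean density of Earth in g cm^-3


def bulk_density(mass_earth, radius_earth):
    """Bulk density in g cm^-3 from mass and radius in Earth units."""
    return EARTH_DENSITY_CGS * mass_earth / radius_earth**3


def frac_density_err(frac_mass_err, frac_radius_err):
    """First-order fractional density uncertainty for independent errors."""
    return math.hypot(frac_mass_err, 3.0 * frac_radius_err)


def is_well_characterized(frac_mass_err, frac_radius_err, period_days, teff_k):
    """Selection from the text: mass to 25% or better, radius to 10% or
    better, P < 100 days, and 4500 K < T_eff < 6500 K."""
    return (frac_mass_err <= 0.25 and frac_radius_err <= 0.10
            and period_days < 100.0 and 4500.0 < teff_k < 6500.0)


# Hypothetical planet: 5 Earth masses and 1.6 Earth radii.
rho = bulk_density(5.0, 1.6)
rho_frac_err = frac_density_err(0.20, 0.05)
print(f"density = {rho:.2f} g/cm^3 (fractional uncertainty {rho_frac_err:.2f})")
```

Because density scales as R⁻³, a 10% radius uncertainty alone contributes 30% to the density uncertainty, which is why the radius precision requirement is stricter than the mass requirement.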

**Figure 21.** Well-characterized exoplanets in the radius–density plane. Planet composition curves from Zeng et al. (2019) are shown. The beige line indicates an Earth-like rocky composition (32.5% Fe + 67.5% MgSiO₃), and the similarly shaded swath is bounded by the pure Fe (upper bound) and MgSiO₃ (lower bound) curves. Dotted lines indicate composition curves for Earth-like cores with H₂ atmospheres at 700 K for different atmospheric mass fractions, which are indicated above each curve. The vertical pink band indicates the range of planetary radii at the center of the young planet gap identified in this work for orbital periods in the 3–30 day range. The hatched band indicates the equivalent radius range for the same orbital periods of the gap identified in Van Eylen et al. (2018).

This is an important point in the context of disentangling correlations between stellar mass, age, and metallicity in the CKS sample. The evidence for a wider radius valley among metal-rich stars is driven mostly by larger sub-Neptunes, on average, with one explanation being the decreased cooling efficiency of planets with higher-metallicity envelopes (Owen & Murray-Clay 2018). By comparison, we observe a sub-Neptune size distribution that is relatively constant below 3 Gyr, while the average size of super-Earths appears to increase over this same time frame (Figure 20). These observations are not easily explained by the anticorrelation between age and [Fe/H] in the CKS sample or the naive expectation of more massive cores around metal-rich stars from core accretion models. Given that a planet’s size is correlated with its mass, one physical interpretation for this observation is that the largest, most massive cores lose their atmospheres at later times.

It is also worth noting that a prolonged mass-loss timescale for some super-Earths might help to explain the rising occurrence of long-period super-Earths with decreasing metallicity observed by Owen & Murray-Clay (2018). Those authors noted that such planets are difficult to explain in the photoevaporation model and might have instead formed after the protoplanetary disk dispersed, akin to the canonical view of terrestrial planet formation in the solar system. However, we note that metallicity and age are correlated in the CKS sample, with the median age of the metal-poor sample in Owen & Murray-Clay (2018) being approximately 0.4 dex older than the metal-rich sample. If mass loss, regardless of the mechanism, proceeds over gigayear timescales, then one might expect a rising occurrence of super-Earths with increasing age (and hence decreasing metallicity).

In a companion paper, Sandoval et al. (2021) found tentative evidence that the fraction of super-Earths to sub-Neptunes rises with system age from ∼1 to 10 Gyr. That work accounted for uncertainties in stellar ages, planetary radii, and the equation for the radius valley itself. The result is in agreement with a previous finding by Berger et al. (2020a), who found that, among planets orbiting stars more massive than the Sun, the fraction of super-Earths to sub-Neptunes is higher among older stars (>1 Gyr) than it is for younger stars (<1 Gyr). Collectively, the present work and the studies mentioned above provide evidence for the evolution of small planet radii over gigayear timescales.

The code and data tables required to reproduce the analyses and figures presented in this paper are made publicly available.¹⁵

This paper is dedicated to the memory of John Stauffer, a valued mentor whose energy and determination were an inspiration to many. We thank the anonymous referee for a thorough and insightful review, as well as Eric Ford, Christina Hedges, David W. Hogg, and Josh Winn for helpful discussions. T.J.D. is especially grateful to Chelsea Yarnell for her irreplaceable support throughout the COVID-19 pandemic. This paper includes data collected by the Kepler mission, funded by the NASA Science Mission directorate. This work presents results from the European Space Agency (ESA) space mission Gaia. Gaia data are being processed by the Gaia Data Processing and Analysis Consortium (DPAC). Funding for the DPAC is provided by national institutions, in particular the institutions participating in the Gaia MultiLateral Agreement (MLA). The Gaia mission website is https://www.cosmos.esa.int/gaia. The Gaia archive website is https://archives.esac.esa.int/gaia. This work made use of the gaia-kepler.fun cross-match database created by Megan Bedell.

Facilities: Kepler, Gaia, GALEX.

Software: astropy (Astropy Collaboration et al. 2013, 2018), jupyter (Kluyver et al. 2016), matplotlib (Hunter 2007), numpy (van der Walt et al. 2011), pandas (pandas Development Team 2020; Wes McKinney 2010), seaborn (Waskom et al. 2017), scikit-learn (Pedregosa et al. 2011), scipy (Jones et al. 2001).

Appendix A: Rotation Period Vetting Sheets

The rotation period vetting sheets (as described in Section 2.1) are available in a Zenodo repository at doi:10.5281/zenodo.4645437. An example sheet is shown in Figure 22.

**Figure 22.** Example rotation period vetting sheet. The Kepler light curve is phase-folded on periods determined from the literature (first row), as well as the first harmonic (second row) and subharmonic (third row). The period determinations of different authors (indicated at top) are presented in a column-wise fashion and color-coded for convenience. In the fourth row, the first 120 days of the light curve (left) and an L-S periodogram (right) are shown. In the fifth row, the full Kepler light curve is shown.

Appendix B: Stellar Age Validation

As we are concerned with the time evolution of the exoplanet radius gap, our study hinges on the accuracy of the stellar ages. Main-sequence stars, which constitute the majority of Kepler planet hosts, typically have large age uncertainties; this is because the changes in a star’s observable properties over its main-sequence lifetime are small relative to the typical measurement uncertainties in those properties. However, the high precision of Gaia parallaxes and photometry has enabled the determination of relatively precise stellar ages from isochrones. The median lower and upper uncertainties on log(age) for stars in the CKS sample are 0.12 and 0.14 dex, respectively.

A true assessment of the accuracy of stellar ages is not possible; essentially all methods for stellar age determination are model-dependent, and benchmarks to calibrate these methods are lacking (Soderblom 2010). However, because the Kepler field is so well studied, it is at least possible to determine the degree of agreement between isochrone ages published by different authors. It is also possible to determine the agreement between ages determined from isochrones versus those determined from gyrochronology or asteroseismology.

To validate the ages used in this study, we compared the isochrone age estimates from F18 with those published in the Gaia–Kepler Stellar Properties Catalog (GKSPC; hereafter B20, Berger et al. 2020a, 2020b), the asteroseismic ages determined in Silva Aguirre et al. (2015), and the gyrochronology ages determined here. We note that while GKSPC ages exist for a far larger portion of the Kepler sample, we restrict our analysis here to only those stars that overlap with CKS VII.

Figure 23 shows the comparison of ages and other parameters from F18 and B20. For 80% of the stars with age estimates in both catalogs, the age estimates agree to within ∼0.4 dex. The median offset in ages is 0.075 dex, with the F18 ages being systematically older, but this shift is smaller than the typical age uncertainties from either catalog. The age discrepancies should not be due to differences in the adopted stellar models; both F18 and B20 use the isoclassify package (Huber et al. 2017) to compute ages from MIST v1.1 models (Choi et al. 2016; Dotter 2016).
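The two summary statistics quoted above, a median offset and the level of agreement reached by 80% of stars, can be computed as in the sketch below. The ages are synthetic draws rather than the F18 or B20 catalogs, and the function name `age_agreement` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)


def age_agreement(age_a_gyr, age_b_gyr):
    """Median offset and 80th-percentile absolute difference, both in dex."""
    delta = np.log10(age_a_gyr) - np.log10(age_b_gyr)
    return np.median(delta), np.percentile(np.abs(delta), 80)


# Synthetic ages in Gyr (not the F18 or B20 catalogs): catalog B scatters
# about catalog A with a small systematic shift in log(age).
age_a = 10 ** rng.normal(0.5, 0.3, 1000)
age_b = age_a * 10 ** rng.normal(-0.05, 0.25, 1000)

median_offset, p80_abs = age_agreement(age_a, age_b)
print(f"median offset = {median_offset:+.3f} dex; "
      f"80% of stars agree to within {p80_abs:.2f} dex")
```

A positive median offset here means that catalog A is systematically older than catalog B, matching the sign convention used for F18 relative to B20 in the text.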

**Figure 23.** Comparison of F18 and B20 isochrone ages, stellar radii, T_eff, and [Fe/H] (from left to right). Residuals are shown in the bottom row for each panel.

To better understand the origins of the age discrepancies between the two studies, we searched for correlations between $\Delta \log_{10}(\mathrm{age}\,\mathrm{yr}^{-1})$ and other parameters in the data set. We found that $\Delta \log_{10}(\mathrm{age}\,\mathrm{yr}^{-1})$ is most strongly correlated with ΔM_* and ΔT_eff (see Figure 24). As both F18 and B20 determine mass and age simultaneously from stellar models, and these two parameters are intrinsically related, the ΔM_*–$\Delta \log_{10}(\mathrm{age}\,\mathrm{yr}^{-1})$ correlation is unsurprising. The correlation with ΔT_eff, however, is more informative. Here F18 derived T_eff from high-resolution spectroscopy, while B20 derived T_eff from isochrones using photometry (specifically Sloan g and 2MASS K_s) and parallaxes as input.

**Figure 24.** Trends in age discrepancy between the CKS (F18) and GKSPC (B20) catalogs. Errors on the age differences are omitted for clarity.

We examined the dependence of T_eff–color relations on [Fe/H] and A_V and found, when using the CKS spectroscopic parameters, that [Fe/H] can explain most of the dispersion in the T_eff–color relations. By contrast, when using the photometric T_eff and [Fe/H] from B20, there is no clear metallicity gradient in the T_eff–color relations.

We find that ΔT_eff is more strongly correlated with the reddening values (sourced from either B20, L21, or Gaia) than it is with any of the metallicity parameters. While reddening might help to explain temperature and age differences for some sources, we note that differences in photometric and spectroscopic temperature scales persist independent of reddening corrections (Pinsonneault et al. 2012).

We next compared the F18 and B20 ages with those determined from precise asteroseismic parameters. Silva Aguirre et al. (2015) determined ages for a sample of 33 Kepler planet candidate host stars with high-S/N asteroseismic observations, achieving a median statistical uncertainty of 14% on age. We compare the ages from F18 and B20 with the asteroseismic ages in Figure 25. We find reasonably good agreement with the asteroseismic ages for both F18 and B20. The residual scatter between the isochrone and asteroseismic ages is 0.11 dex for F18 and 0.22 dex for B20. In both cases, the residuals are comparable to the median age uncertainties from those catalogs.
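To make the dex values above concrete, the following sketch computes the scatter of log-age residuals for a toy sample and converts a scatter in dex into a multiplicative factor. It is an illustration under stated assumptions: the text does not say how the residual scatter was estimated, so a sample standard deviation is assumed, and the four pairs of ages are hypothetical.

```python
import math
from statistics import stdev

def log_age_residuals(ages_a_gyr, ages_b_gyr):
    """Residuals log10(age_a) - log10(age_b), in dex."""
    return [math.log10(a) - math.log10(b) for a, b in zip(ages_a_gyr, ages_b_gyr)]

def dex_to_factor(dex):
    """Multiplicative factor corresponding to an offset in dex."""
    return 10.0 ** dex

# Hypothetical isochrone and asteroseismic ages (Gyr) for four stars; not real data.
isochrone_ages = [2.0, 4.5, 6.0, 9.0]
asteroseismic_ages = [2.5, 4.0, 6.0, 7.5]

residuals = log_age_residuals(isochrone_ages, asteroseismic_ages)
scatter_dex = stdev(residuals)  # assumed estimator: sample standard deviation
print(f"toy scatter: {scatter_dex:.3f} dex")

# The scatters quoted in the text, expressed as factors in age.
for quoted in (0.11, 0.22):
    print(f"{quoted:.2f} dex -> factor of {dex_to_factor(quoted):.2f}")
```

Under this conversion, 0.11 dex corresponds to a factor of about 1.29 in age and 0.22 dex to a factor of about 1.66.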

**Figure 25.** Comparison of F18 and B20 ages with asteroseismic ages from Silva Aguirre et al. (2015; top panels) and gyrochronology ages from this work.

Gyrochronology ages were computed using the stardate software package (Angus et al. 2019a, 2019b).¹⁶ The stardate ages were computed in the gyrochronology mode alone rather than in the combined isochrone-fitting and gyrochronology mode. The stardate gyrochronology relations are calibrated in Gaia color space. Using the rotation periods we vetted in Section 2, we noticed increased scatter in the (B_P − R_P)–P_rot plane compared to the T_eff–P_rot plane, where T_eff is the CKS spectroscopic temperature from F18. As such, rather than using each star's actual Gaia colors, which are susceptible to reddening, we converted the F18 spectroscopic T_eff and B20 photometric T_eff to the predicted Gaia colors using the relation in Curtis et al. (2020), which was calibrated for stars with negligible reddening. Using the vetted P_rot and predicted (B_P − R_P) colors, we then computed the gyrochronology ages (without uncertainties). Our comparison of the isochrone and gyrochronology ages is shown in Figure 25. We note that there is better agreement between F18 and the gyrochronology ages at young ages (<1 Gyr).

We note that the stardate model has not been updated to include recently determined open cluster rotation period sequences in its calibration. As such, we also compared the CKS sample to empirical gyrochrones from Curtis et al. (2020). This comparison is shown in Figure 26, which shows that the F18 isochrone ages do not always map predictably onto the T_eff–P_rot plane. For example, in the F18 log(age) bin of 8.75–9 dex (≈0.6–1 Gyr), approximately half of the stars fall below the 1 Gyr gyrochrone, and half lie above it. Similarly, in the F18 log(age) bin of 9.5–9.75 dex (≈3.2–5.6 Gyr), a nonnegligible number of stars fall below the log(age) ≈ 9.4 (≈2.5 Gyr) gyrochrone. However, we note that the vast majority of stars with F18 isochrone ages of log(age) < 9.25 fall below the log(age) ≈ 9.4 gyrochrone. This is in agreement with the comparison made to the stardate gyrochronology ages, in the sense that the majority of stars with F18 isochrone ages ≲1.8 Gyr appear to be younger than ≈2.7 Gyr from a gyrochronology analysis.
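The conversions between log(age) and age in Gyr quoted in the preceding paragraph can be checked with a short calculation. The helper name below is hypothetical; the only assumption is that log(age) denotes the base-10 logarithm of the age in years.

```python
def log_age_to_gyr(log_age_yr):
    """Convert log10(age / yr) to an age in Gyr."""
    return 10.0 ** log_age_yr / 1e9

# Bin edges and gyrochrone ages mentioned in the text.
for log_age in (8.75, 9.0, 9.25, 9.4, 9.5, 9.75):
    print(f"log(age) = {log_age:.2f} -> {log_age_to_gyr(log_age):.2f} Gyr")
```

The rounded results (0.56, 1.00, 1.78, 2.51, 3.16, and 5.62 Gyr) match the approximate values given in the text.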

**Figure 26.** In each panel, contours show the Gaussian kernel density estimate of CKS planet hosts in the T_eff–P_rot plane. Points depict stars with F18 isochrone ages indicated in the titles of each panel. The solid, dashed, dashed–dotted, and dotted lines indicate polynomial fits to the empirical gyrochrones of the Pleiades (log(age) ≈ 8.1), Praesepe (log(age) ≈ 8.8), NGC 6811 (log(age) ≈ 9), and NGC 6819 + Ruprecht 147 (log(age) ≈ 9.4) clusters, respectively (Curtis et al. 2020).

Finally, we also examined the evolution of other physical parameters known to correlate with age, such as the variability amplitude R_var, near-UV (NUV) excess, and velocity dispersion. We tracked velocity dispersion using ${v}_{\tan }$, the velocity tangential to the celestial sphere, and v_b, the velocity in the direction of the galactic latitude, sourced from Lu et al. (2021). GALEX NUV magnitudes were obtained from Olmedo et al. (2015). For a crude approximation of the NUV excess, we performed a quadratic fit to the full Kepler Q1-Q17 DR25 sample in the (G − G_RP) versus (NUV − K_s) color–color diagram. The NUV excess was then defined as a star's (NUV − K_s) color minus the quadratic color–color trend. No dereddening was performed. Figure 27 shows the evolution of these parameters as a function of age. For both the F18 and B20 isochrone ages, we observe increasing dispersion in ${v}_{\tan }$ and v_b with age, as expected. The strongest of the expected correlations is observed between R_var (sourced from Lu et al. 2021) and F18 age, with R_var declining for the first ∼3 Gyr before plateauing. The average NUV excess appears to decline over a similar timescale when using the F18 ages, though that trend is less significant, and there may be residual systematics from the manner in which we computed the excess. Both the R_var and NUV excess trends are expected, as starspot coverage, variability amplitudes, and chromospheric activity are known to decline with age. By comparison, when using the B20 ages, the behavior of R_var and NUV excess with age is not as expected.

We conclude by noting that, while substantial uncertainties remain for isochronal ages, there is qualitative agreement between the CKS ages and the ages (or age indicators) derived from independent methods. In some of the comparisons above, the CKS and GKSPC ages perform comparably well, though it is at the youngest ages (≲3 Gyr) where the GKSPC ages do not reproduce some expected trends. As the evolution of small planets at early times is a primary focus of this work, we adopt the CKS ages.
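The crude NUV excess described above (a star's (NUV − K_s) color minus a quadratic trend in (G − G_RP)) can be sketched as follows. This is a minimal illustration, not the code used for Figure 27: the function names are hypothetical, the fit is an ordinary unweighted least-squares fit solved through the normal equations (the text does not specify the fitting method), and the toy colors are synthetic.

```python
def fit_quadratic(x, y):
    """Least-squares fit of y ~ c0 + c1*x + c2*x**2 via the normal equations."""
    # Power sums needed for the 3x3 normal equations.
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    # Augmented matrix [A | b] with A[i][j] = sum(x**(i+j)), b[i] = sum(y * x**i).
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(3):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [u - factor * v for u, v in zip(a[row], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]

def nuv_excess(g_grp, nuv_ks, coeffs):
    """Observed (NUV - Ks) color minus the quadratic color-color trend."""
    c0, c1, c2 = coeffs
    return nuv_ks - (c0 + c1 * g_grp + c2 * g_grp ** 2)

# Synthetic reference sample lying exactly on a quadratic trend; not real data.
ref_g_grp = [0.40, 0.45, 0.50, 0.55, 0.60, 0.65]
ref_nuv_ks = [2.0 + 10.0 * c + 4.0 * c ** 2 for c in ref_g_grp]
coeffs = fit_quadratic(ref_g_grp, ref_nuv_ks)

# A hypothetical star that is 0.5 mag bluer in (NUV - Ks) than the trend.
excess = nuv_excess(0.50, 7.5, coeffs)
print(f"NUV excess = {excess:+.2f} mag")
```

With this definition, a negative value means the star is bluer in (NUV − K_s) than the trend at its (G − G_RP) color. Because no dereddening is applied, any reddening in either color propagates directly into the computed excess.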

**Figure 27.** Validation of isochronal age estimates. We show v_tan, v_b, R_var, and ${\mathrm{log}}_{10}({F}_{{\rm{NUV}}}/{F}_{{\rm{Ks}}})$, from left to right, as a function of isochronal ages from F18 (top row) and B20 (bottom row). Spearman rank correlation coefficients (ρ) and p-values are printed in the top left corner of each panel. Black points with error bars indicate the mean and standard deviation of the data binned by 0.125 dex in log(age).
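The binned points described in the Figure 27 caption can be reproduced in outline with the sketch below. It is illustrative only: the caption gives the bin width (0.125 dex) but not the bin edges or the form of the standard deviation, so bins aligned to multiples of the width and a sample standard deviation are assumed, and the toy values are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

def bin_by_log_age(log_ages, values, width=0.125):
    """Return (bin center, mean, std, count) for each occupied log(age) bin."""
    bins = defaultdict(list)
    for log_age, value in zip(log_ages, values):
        # Assumed convention: bin edges fall on integer multiples of the width.
        bins[math.floor(log_age / width)].append(value)
    summary = []
    for index in sorted(bins):
        members = bins[index]
        center = (index + 0.5) * width
        # A single-member bin has no defined sample standard deviation.
        spread = stdev(members) if len(members) > 1 else float("nan")
        summary.append((center, mean(members), spread, len(members)))
    return summary

# Hypothetical log(age) values and tangential velocities (km/s); not real data.
log_ages = [9.00, 9.05, 9.10, 9.13, 9.20, 9.30]
v_tan = [10.0, 12.0, 14.0, 20.0, 30.0, 40.0]

for center, avg, spread, count in bin_by_log_age(log_ages, v_tan):
    print(f"log(age) ~ {center:.4f}: mean = {avg:.1f}, std = {spread:.2f}, n = {count}")
```

An increasing standard deviation across bins is the signature of the growing velocity dispersion with age discussed in the text.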

Footnotes

8
https://gea.esac.esa.int/archive/
9
https://archive.stsci.edu/kepler/search_retrieve.html
10
https://exoplanetarchive.ipac.caltech.edu/docs/PurposeOfKOITable.html#cumulative
11
An example vetting sheet is presented in Appendix A, and all sheets are available through the journal.
12
The completeness curve in Figures 2 and 3 was computed for CKS stars using CKS stellar parameters and the methodology of Burke et al. (2015) as implemented in Python at https://dfm.io/posts/exopop/.
13
We opted not to perform the gyrochronology classification in color space because we noted increased scatter in the (B_P–R_P)–P_rot diagram for CKS stars, possibly a result of reddening, metallicity effects, or both.
14
We note that cross-matching the V18 sample with Silva Aguirre et al. (2015) and F18 reveals that the asteroseismic sample contains host stars with a broad range of ages, from ≈2 to 12.5 Gyr.
15
https://github.com/trevordavid/radius-gap
16
https://github.com/RuthAngus/stardate
