The Accuracy of Rotational Parameters Predicted by High-level Quantum-Chemical Calculations: The Case Study of Sulfur-Containing Molecules of Astrochemical Interest

The accuracy of rotational parameters obtained from high-level quantum-chemical calculations is discussed for molecules containing second-row atoms. The main focus is on computed rotational constants for which two statistical analyses have been carried out. A first benchmark study concerns sulfur-bearing species and involves 15 molecules (for a total of 74 isotopologues). By comparing 15 different computational approaches, all of them based on the coupled-cluster singles and doubles approach (CCSD) augmented by a perturbative treatment of triple excitations, CCSD(T), we have analysed the effects on computed rotational constants due to: (i) extrapolation to complete basis-set limit, (ii) correlation of core electrons, and (iii) vibrational corrections to rotational constants. To extend the analysis to other molecules containing secondrow elements, as well as to understand the effect of higher excitations, a second benchmark study involving 11 molecules (for a total of 54 isotopologues) has been performed. Finally, the rotational parameters of seven sulfur-containing molecules of astrochemical interest (CCS, C3S, C4S, C5S, HCCS +, HC4S + and HOCS+/HSCO+) have been computed and compared to experimental data, when available, also addressing the direct comparison of simulated and experimental rotational spectra.


Introduction
Rotational spectroscopy plays a crucial role in astrochemical and astrophysical studies because molecular species in the interstellar medium (ISM) are in most cases detected via their rotational signatures. 1,2 Indeed, the observation of spectroscopic signatures provides the unequivocal proof of the presence of the chemical species of interest, 3 which in turn is the starting point for the development of astrochemical models. To this end, the rotational spectra of molecules of astrophysical and -chemical interest are investigated in laboratory studies: rotational transitions are recorded in extended frequency regions and then analysed and assigned. The derived spectroscopic parameters are used to predict rotational transitions that have not been directly measured; therefore, the accuracy and reliability of these constants are of great importance. In fact, the detection of a molecular species in space by rotational spectroscopy critically depends on the availability of accurate predictions of rotational transitions and in turn of the corresponding spectroscopic parameters. Among them the rotational constants are the most important ones. 4 Nowadays, the availability of large instantaneous bandwidth receivers, which are also characterized by high spectral resolution, allows for large spectral surveys with uniform sensitivity. These surveys are clearly the ideal means to obtain a complete census of the species that emit in the recorded spectral survey. However, the simultaneous presence of the spectroscopic features of several molecules requires the rotational transition frequencies to be known with an accuracy of a few hundreds of kHz or even better.
To guide the first steps of the spectral assignment procedure in laboratory, as well as to verify the reliability of rotational spectroscopic parameters determined from experiments, quantum-chemical calculations are increasingly employed (see, for example, refs. [5][6][7]. To support the experimental investigation computational predictions of rotational spectra should be accurate, thus implying that high-level quantum-chemical calculations are to be used. These predictions involve two tasks, namely (a) the determination of the required spectroscopic parameters, and (b) the simulation of the spectra based on the given set of spectroscopic parameters. The latter typically involves the diagonalization of an appropriate Hamiltonian, while the accurate determination of the spectroscopic parameters requires a quantum-chemical treatment of the molecular system under consideration. In the last decade, numerous theoretical studies reporting computed rotational constants have been published (see, for example, refs. 5,6,8 and 9). However, a systematic investigation of the accuracy achievable in quantum-chemical computations of these parameters has been carried out for most parts only for molecular species containing first-row elements. 8 There is a great astrophysical interest in S-containing molecules because, despite the fact that simple diatomic species up to astronomical complex molecules such as ethyl mercaptan (CH 3 CH 2 SH) have been detected in interstellar gas clouds and circumstellar outflows (see, for example, refs. 10,11), the details on the sulfur chemistry that takes place in space are mostly unknown. Indeed, while sulfur compounds in the gas phase have been observed in the diffuse ISM with almost no depletion with respect to its solar abundance, 12 the sulfur abundances detected in the cold and dense ISM are much lower. 13 This is known as the "sulfur exhaustion problem" 14 with the chemical form of the missing sulfur not yet being identified. For this reason, there is a great interest in the spectroscopic characterization of neutral as well as ionic sulfur-containing species.
The quantum-chemical approaches employed throughout our calculations are based on coupled-cluster techniques. 15 To investigate their accuracy two systematic benchmark studies have been carried out. The first one focuses on the rotational constants of Sbearing species (with 15 different molecules and 74 isotopologues being considered).
Starting from the coupled-cluster singles and doubles approach (CCSD) augmented by a perturbative treatment of triple excitations, CCSD(T), 16 in conjunction with a polarized triple-zeta quality basis set, computational approaches which differ in the treatment of core-correlation and in the extrapolation to the basis-set limit have been considered. To extend our analysis as well as to understand the effect of higher-order excitations, a second benchmark investigation involving molecules containing different second-row elements, namely S, P, S and Cl (with 11 molecules and 54 isotopologues being considered), has also been performed.
The present investigation is intended to provide a statistical analysis for rotational constants of molecules containing second-row elements, with a particular focus on sulfurbearing species. The computational methodologies employed in the benchmark studies have also been applied to sulfur-containing molecules of strong astrochemical interest, thus providing an accurate spectroscopic characterization of the thiocumulenes family C n S, with n=2, 3, 4 and 5, and of the protonated species HCCS + , HC 4 S + , and HSCO + /HOCS + .
While for the neutral molecules some experimental data are available, these are completely lacking for the cations, thus preventing so far the possibility of their detection in space.
Therefore, for the C n S family the main aim is to complement the experiment, while for the cationic species the present work provides the first accurate computational spectroscopic characterization. The paper is organized as follows. In the succeeding section computational details are described with particular emphasis on the calculations of the spectroscopic parameters of interest and the benchmark studies. In the subsequent section, the results of the two benchmark studies are reported and discussed. Finally, the accuracy obtained for two different categories of species, small-and medium-sized molecules, of astrochemical interest are presented and discussed.

Computational details
In the present study, quantum-chemical calculations are based on the coupled-cluster (CC) theory 15 for the treatment of the electron correlation and have been performed using the hierarchic series of correlation consistent basis sets developed by Dunning and collaborators. [17][18][19] For open-shell species, the unrestricted-HF reference wavefunction has been employed. The quantum-chemical package CFOUR 20 has been used throughout,with the only exception of the CC singles, doubles and triples, CCSDT, 21,22 and CC with singles, doubles, triples, quadruples, CCSDTQ, 23 calculations which have been carried out using the general CC program MRCC by Kállay 24 interfaced to the CFOUR package.

Spectroscopic parameters
The theoretical background of the spectroscopic parameters needed for the accurate simulation of the rotational spectrum are briefly reviewed in the following subsections. The focus is mainly on the computational tasks these parameters require for their determination.

Rotational constants
Rotational constants are the main parameters that determine the rotational spectrum.
Molecules can have up to three different rotational constants (A B C) corresponding to the three inertial axis (a, b, and c). In our case, those of interest are the rotational constants for the vibrational ground-state, generally denoted as B i 0 , with i indicating an inertial axis.
The vibrational ground state rotational constants can be decomposed as: where B i e is the equilibrium rotational constant, i.e., the one that corresponds to the minimum of the Born-Oppenheimer (BO) potential energy surface (PES), while DB i 0 represents its vibrational correction.
The equilibrium contribution depends only on the isotopic composition of the molecule and on its equilibrium structure r e . Consequently, B i e is obtained in a straightforward manner from a geometry optimization. By means of second-order vibrational perturbation theory (VPT2), 25 the vibrational correction can be expressed as: where the sum runs over all normal coordinates (r) with v r as the corresponding vibrational quantum number. The a i r in eq. (2) are the vibration-rotation interaction constants. 5 To evaluate the DB i 0 term, an anharmonic force field has to be computed. The reader is referred to refs. 26-29 for a detailed description on how this task can be accomplished.
Since B i e provides by far the largest contribution to B i 0 , high accuracy is required for its determination. This can be achieved by means of composite approaches, with the latter permitting to evaluate various contributions separately, each at the highest possible level of theory, and to combine them together using the additivity scheme. 8,30,31 The contributions that are combined are obtained in the following manner: 1. Extrapolation to the complete basis-set (CBS) limit of the Hartree-Fock energy using the exponential three-point formula by Feller 32 in conjunction with either the cc-pVQZ, cc-pV5Z, and cc-pV6Z basis sets (option 1) or the cc-pVTZ, cc-pVQZ, and cc-pV5Z basis sets (option 2). 3. The core-correlation contribution: This term is obtained as the difference between all-electron and fc-CCSD(T) calculations using the same basis set, either cc-pCVQZ or cc-pCVTZ. 35,36 The contribution is referred to as core/cc-pCVQZ or core/cc-pCVTZ, respectively..

Full treatment of triple excitations beyond CCSD(T). This correction is obtained as
the energy difference between the CCSDT and CCSD(T) levels of theory using the cc-pVTZ basis set.
The contribution is referred to as DT.
5. Full treatment of the quadruple excitations, which is obtained as the energy difference between CCSDTQ and CCSDT. Due to its high computational cost, this contribution is computed using a small basis set, cc-pVDZ.
This contribution is referred to as DQ.
By combining all of these contributions, different composite schemes can be defined, with the fc-CCSD(T)/CBS(5,6) + core/cc-pCVQZ + DT + DQ one being the theoretically most complete and thus the best one. It is noted that the extrapolation schemes above were originally developed for energies; however, their use for forces (gradients) 8,30,31 can be justified because the corrisponding geometry optimization provides the minima on the extrapolated potential-energy surface. The

Centrifugal-Distortion Constants
When aiming at high accuracy in the prediction of spectra, additional rotational parameters need to be considered. In particular, the quartic and sextic centrifugal-distortion constants have a non negligible effect on the actual frequencies of rotational transitions. The quartic centrifugal-distortion constants (D, d) require the evaluation of a harmonic force field, while the sextic constants (H, h) necessitate the computation of a cubic force field. The levels of theory considered in the present work for these constants are fc-CCSD(T)/cc-pVTZ, fc-CCSD(T)/cc-pVQZ, and CCSD(T)/cc-pCVQZ.

Electron Spin-Rotation Constant
In the case of open-shell molecules, a coupling between the magnetic moment associated to the spin of the unpaired electrons and the magnetic field generated by the rotation motion occurs. This interaction is described by the electron spin-rotation coupling constant, denoted as g. The latter is calculated in a perturbative manner as second derivative of the energy with respect to the electron spin and the rotational angular momentum as perturbations, as described in Ref. 38. For the calculations presented in the following, the CCSD method has been used in conjunction either with the aug-cc-pCVQZ basis set 19,39 or, within the frozen-core approximation together with the aug-cc-pVTZ set.

Benchmark study
Based on the fact that theoretical predictions are based on approximate quantum-chemical schemes, a proper estimate of the corresponding uncertainty is needed. Numerous studies on the accuracy of computed molecular properties and energies, 40-42 geometrical parameters, 30,31,43-45 and spectroscopic quantities 5,8 have been reported in the literature. However, a lack of reliable statistics for systems containing second-row atoms is noted in the case of rotational constants. To fill this gap, the results of two benchmark studies are reported in the following.
To analyze the rotational parameters obtained from the different schemes, statistical measures are used. This means that for each computational approach we report the mean error (D), the mean absolute error (D abs ), the maximum error (D max ), and the standard deviation (D std ). By denoting as D k the deviation of the computed value from its corresponding experimental counterpart, these statistical measures are defined as: with the sums running over the n available values. All experimental rotational constants are considered with the same statistical weight in the analysis. This means that for linear molecules the weight of the unique rotational constant is 3, while for the asymmetric tops the weight of each rotational constant is 1.
Assuming that the errors follow a normal distribution, a pictorial representation of our results for each computational approach is obtained via: where 1 p 2pD std is a normalization constant.

Benchmark I
The first benchmark focuses on the rotational constants of S-bearing species and the results for 15 different computational approaches have been compared.
We start with the following levels of theory: To evaluate the effect of core correlation we continue with the following schemes: where the former stands for all-electron calculations at the CCSD(T) level using cc-pCVQZ.
The second approach instead use an additivity scheme and considers the core-correlation contribution obtained at the CCSD(T)/cc-pCVTZ level on top a fc-CCSD(T)/cc-pV5Z calculation.
To estimate the CBS limit, the extrapolation formula previously mentioned have been used.
Using the additivity schemes, core correlation can also be considered for the last two approaches. Both the cc-pCVTZ and cc-pCVQZ basis sets have been employed, thus yielding to four different composite schemes: and in this way a total of 74 isotopologues were considered in our first benchmark (see Table I).

Benchmark II
To extend the analysis to other molecules containing second-row elements as well as to investigate the effect of higher excitations, a second benchmark investigation has been carried out using a set of 11 small molecules (and a total of 54 isotopologues) containing other second-row elements: HCl, SiS, PN, CS, OCS, H 2 S, HBS, FBS, HCP, FCP, and ClCP.
The complete list of isotopologues considered in this benchmark is given in Table II.

Rotational constants of sulfur containing molecules of astrochemical interest
Based the previous statistical analyses, the rotational constants for some member of the thiocumulenes family C n S, with n=2, 3, 4 and 5 have been determined. Thiocumulenes are linear molecules with a typical cumulene bond distance, their electronic ground state depending on the number n of carbon atoms present. 46,47 Indeed, if n is odd the molecule is a closed-shell species, while, if n is even, the molecule is an open-shell system in a X 3 S electronic ground state. Since H + is abundant in the ISM, the protonated species HCCS + , HC 4 S + , and HSCO + /HOCS + have also been investigated.
Due to the different computational cost required by quantum-chemical calculations, the systems under investigation have been classified in two categories: 1. "Small molecules" (CCS, HCCS + , HSCO + , and HOCS + ) with a maximum of three non-hydrogen atoms; 2. "Medium-sized molecules" (C 3 S, C 4 S, HC 4 S + and C 5 S) with more than three nonhydrogen atoms; For the "small molecules" set, the best theoretical approach considered is the fc-

Benchmark I
The rotational constants considered in the statistical analysis range from 2.7 to 310 GHz, therefore only a discussion in terms of relative errors is meaningful. 8

Computed vs. Semi-experimental Equilibrium Rotational Constants
The results of the first comparison, i.e. those involving the semi-experimental B i e as reference, are reported in table III, thus showing the statistical measures for the various levels of theory considered.
Extrapolation to the CBS limit. The first contribution addressed is the convergence to the CBS limit, for which a graphical representation is provided in figure 1. It is noted that, as the basis set enlarges from cc-pVTZ to cc-pV5Z, both the mean and standard errors lower. This trend is graphically represented by tighter normal distribution functions, which are centred closer to the y-axis. However, while extrapolation to the CBS limit improves  Core correlation corrections. The second contribution that needs to be addressed is the effect of core correlation, with the pictorial representation given in figure 2. To understand the importance of this term, the comparison between the fc-CCSD(T)/cc-pVQZ and CCSD(T)/cc-pCVQZ levels of theory is first of all discussed. Going from the former to the latter, the standard deviation halves and the mean error decreases from -0.81% to -0.04%.
Focusing the attention on additive schemes, the performance of the two core-valence basis sets used in the treatment of core correlation, namely cc-pCVTZ and cc-pCVQZ, can be analyzed. According to our statistics, when the core-correlation contribution is considered in conjunction with the extrapolation scheme, the cc-pCVTZ basis set interestingly provides slightly better mean errors, which is somewhat unexpected. However   cc-pVTZ and cc-pVQZ basis sets.

Computed vs. Experimental Ground-state Rotational Constants
The previous conclusions are also confirmed by the second comparison carried out for which the statistical measures are reported in table IV (with the corresponding statistical representation given in the SI). As shown in figure 3, the fc-CCSD(T)/CBS(T,Q)-based approaches have definitely lower accuracy with respect to those based on the larger basis sets and even when vibrational corrections are added, the D std still remains around 0.5%.

Vibrational
Corrections. An important contribution to be discussed within the second comparison is the effect of vibrational corrections. Table IV points  + core/cc-pCVTZ + DB i 0 /cc-pVTZ and fc-CCSD(T)/CBS(Q,5) + core/cc-pCVQZ + DB i 0 /cc-pVTZ composite schemes seems to be related to the core-valence correlation treatment, with the rotational constants appearing to be slightly more overestimated when using the core-valence correlation contribution calculated with the cc-pCVQZ basis set instead of that cc-pCVTZ. However, the better performance of the CCSD(T)/cc-pCVTZ level is most likely due to error compensations between the extrapolation to the CBS limit and the core-valence contribution and/or missing DT and DQ terms.

Internal Theoretical Comparison
Moving to the third, internal comparison, we note that the fc-CCSD(T)/CBS(Q,5) + core/cc-pCVTZ level shows very similar error statistics compared to the best theoretical approach (see table VI and the corresponding graphical representation in the SI). Indeed, the standard deviation of this scheme is only 0.04% and the mean error is -0.1%. The composite scheme used for the characterization of the medium-sized molecule shows a standard deviation of 0.5% with respect to the best theoretical approach and a mean error quite small, this being 0.08%. As expected, all frozen-core results are very far from the best theoretical values and

Benchmark II
In addition to investigate the role played by full triples and quadruples corrections, this benchmark study is also intended to analyze the effect of increasing the basis set from a quadruple-zeta to a quintuple-zeta in the evaluation of the core-correlation contribution.

Accurate prediction of rotational spectra for S-containing molecules of astrochemical interest
This section aims at discussing the results obtained for the specific sulfur-containing molecules of astronomical interest. Furthermore, general conclusions on the level of theory required for the accurate prediction of rotational spectra in the specific case of molecular species containing second-row elements are drawn.

Small-sized molecules
The results obtained for CCS, HCCS + and the protonated species of OCS are presented with particular emphasis on the accuracy obtained for each spectroscopic parameter. To discuss the accuracy of the computational schemes investigated, we have selected the CCS radical, for which the best theoretical results are reported in table VIII, where they are also compared with the best experimental data. The rotational spectrum was simulated using Pickett's SPCAT program 96  with the experimental one with the relative error being ⇠18%.
The theoretical data reported in table VIII have been used to simulate the rotational  MHz. If we compare the last result with the spectrum simulated using the complete set of theoretical values VIII (see discussion in the previous paragraph), differences of about 0.16% are noted in reproducing the experimental spectrum is noted. Since in both cases the same level of theory is used for B 0 , by inspecting this table, it is apparent that this discrepancy is mostly due to the limited accuracy of the computed electronic spin-rotation interaction constant g.
A simulation of the entirely computed rotational spectrum has also been performed also for the protonated species of CCS and OCS, namely, HCCS + and HSCO + , for which the experimental data present in the literature are missing or limited. A summary of the computed spectroscopic parameters is reported in the SI and will be of future interest for experimental measurements. Indeed, the same accuracy as that obtained for CCS is expected. However, when comparing the theoretical and experimental 98 rotational constants of HOCS + , larger relative errors are observed: 0.43%, 0.025% and 0.015% for the A 0 , B 0 , and C 0 constants, respectively. Nevertheless, these experimental data were obtained only with a very limited number of low frequency measurements and call for an improvement, and for this reason, HOCS + has not been included in the benchmark set of molecules. In particular, the accuracy of the experimental rotational constants is very limited, with a standard deviation from the fit as large as 4 MHz. Results of mixed quality have been also obtained for the quartic-centrifugal distortion constants, with errors ranging from 6 to 13%. However, once again, the experimental counterparts are not particularly accurate with one parameter (d 1 ) clearly wrongly determined. Furthermore, it should be noted that, as already mentioned in ref. 99, despite the fact that HSCO + is about 20 kJ/mol lower in energy than HOCS + , only the latter one has been observed experimentally; this renders both protonated species worthy of further experimental investigation.

Medium-sized molecules
The discussion of the results for the medium-sized molecules is mainly focussing on the C 3 S molecule. The theoretical data from the present work data are reported in table IX together with the corresponding experimental values. We note that the relative error for the rotational constant obtained at the fc-CCSD(T)/CBS(T,Q) + core/cc-pCVTZ + DB i 0 (cc-pVTZ) level is ⇠0.03%, thus being nearly one order of magnitude smaller than that predicted by the benchmark study. The same trend is observed for C 5 S, for which the relative error is ⇠0.05%. Instead, for C 4 S a relative error of 0.13% has been obtained.
However, also this uncertainty is lower than what is expected based on the benchmark study, thus suggesting that the latter might provide rather conservative error estimates. The relative error for the quartic-centrifugal distortion constant of C 3 S is around 6%, consistent with previous results, even if the parameter has been obtained at the CCSD(T)/cc-pVQZ level of theory within the frozen-core approximation. A similar relative error is also observed in the case of C 5 S, while it doubles for the C 4 S radical.
As shown in figure 5, the agreement between the experimental and predicted spectra in the case of C 3 S is still good. The transition highlighted in the inset indicates that deviations of ⇠100 MHz are observed for frequencies above 300 GHz, i.e., deviations of about ⇠0.04% in relative terms. Results of similar quality are seen for C 4 S and C 5 S and are expected for HC 4 S + . For all these molecules, the computational results are collected in the SI.

Concluding remarks
By employing different composite schemes, an accurate spectroscopic characterization has been carried out for the thiocumulenes family C n S, with n=2, 3, 4 and 5, and for the HCCS + , HC 4 S + , and HSCO + /HOCS + protonated species. While the former family of molecules mainly allowed us to investigate the accuracy obtainable, for the cationic species accurate predictions of the rotational parameters have been provided for the first time. Different composite approaches have been investigated and their accuracy has been addressed thanks to a benchmark study. In particular, for HCCS + , HSCO + , and HOCS + the CCSD(T)/CBS(5,6) + core/cc-pCVQZ + DT + DQ + DB i 0 (cc-pCVQZ) level of theory is   expected to provide rotational constants with an accuracy better than 0.01%. For HC 4 S + , the best approach considered is CCSD(T)/CBS(T,Q) + core/cc-pCVTZ + DB i 0 (cc-pVTZ), which should predict rotational constants with an accuracy better than 0.1%.
From the inspection of the results collected in Tables III and VII,  The optimized geometrical parameters that are at the basis of the statistical analysis reported in Table VII point out that moving from core/cc-pCVQZ to core/cc-pCV5Z leads to an additional shortening of the bond distances. This means that, once the corecorrelation contribution is evaluated at the CCSD(T)/cc-pCV5Z level and coupled with extrapolation to the CBS limit performed with basis sets as large as cc-pV5Z and cc-pV6Z, the combination of the two effects ends up in bond lengths which appear to be too short, thus leading to computed rotational constants that slightly overestimate the corresponding experimental values.
Overall, the statistical analyses of the present study suggest that (i) the CCSD(T)/CBS(5,6) + core/cc-pCVQZ level provides high-accuracy results, not requiring full-triples and fullquadruples contributions; (ii) similar accuracy can be obtained only with the inclusion of the latter corrections if the extrapolation to the CBS limit is performed with smaller basis sets; (iii) the employment of the cc-pCV5Z basis set in the evaluation of the core-correlation contribution is discouraged unless in conjunction with CCSD(T)/CBS(Q,5); (iv) in the case of medium-sized molecules containing second-row elements, geometry optimizations using the CCSD(T)/CBS(T,Q) + core/cc-pCVTZ composite scheme in conjunction with vibrational corrections evaluated at the fc-CCSD(T)/cc-pVTZ level are able to provide rotational parameters that are suitable and sufficiently accurate to guide experiment.

Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: ...

Experimental values and best theoretical estimates for the rotational constants of all
the isotopologues considered in the benchmark I study.
2. Experimental values for all the rotational constants considered in the benchmark II study.
3. Graphical representations of the comparison between computed and experimental rotational constants as well as of the theoretical internal comparison.

Acknowledgments
This