Big Data Capability Improvement Project | Student Achievement Exhibition Series 4

84cb58401414c1b8b0f5433c5ff3c643.png

Introduction

To make full use of Tsinghua University's multidisciplinary strengths, build a platform for interdisciplinary integration, innovate interdisciplinary training, and cultivate "π-shaped" talents with big data thinking and the ability to innovate in applications, the Graduate School of Tsinghua University, the Tsinghua University Big Data Research Center, and relevant departments jointly designed and organized the "Tsinghua University Big Data Capability Improvement Project". The project has been implemented and is well recognized by teachers and students. By integrating and building course modules, it has formed a big data curriculum that combines big data thinking and skills, cross-disciplinary learning, and practical application, together with a hybrid online and offline teaching mode, and it has significantly improved students' big data analysis and innovative application abilities.

Looking back on 2022, the Tsinghua University Big Data Capability Improvement Project achieved fruitful results. Students successfully applied the data thinking and skills learned in the courses to study and research in their own majors. While experiencing the appeal of data science, they also developed into interdisciplinary, innovative talents. Now let us see their work through 10 student representatives from 8 departments!

Mechanism of Intermolecular Interactions on Metallocene Catalysts

76fa01c9a168ab2bd41833e6aaf4838b.png

Highlights

(1) Establish a theoretical method for studying how intermolecular interactions regulate catalytic performance.

(2) Establish statistical models to evaluate how well theoretical methods at different levels describe the energies and geometries of the fused-ring aromatic ligands of the catalysts.

(3) Visualize the intermolecular forces from the theoretical analysis, and clarify how the intermolecular forces of face-to-face fused-ring aromatic dimer ligands regulate the new titanocene catalysts.

1e48ec1b393348c7d51b0b0f30260908.gif

4c19f21da7f3fef1a74bd2b863a79454.gif

c543e513f52f8851d16ba6b90d2c3c35.gif

This work focuses on the idealized configurations of a series of monoarene ligands, and of their face-to-face dimer ligands, for aryloxy titanocene catalysts. The structures are shown schematically in Fig. 1. Here a denotes a single aromatic hydrocarbon ligand, and the numbers 1, 2, 3, 4 and 5 correspond to benzene (C6H6), naphthalene (C10H8), anthracene (C14H10), pyrene (C16H10) and hexafluorobenzene (C6F6); b denotes the dimer ligand of the corresponding arene.

04df670f407a89c9b5bce95d1de02f11.png

Fig. 1 Molecular structures of auxiliary ligands R [R = a1) C6H6, a2) C10H8, a3) C14H10, a4) C16H10, a5) C6F6 and their sandwich dimers, b1) C6H6…C6H6, b2) C10H8…C10H8, b3) C14H10…C14H10, b4) C16H10…C16H10, b5) C6F6…C6F6] to modify half-titanocene complex.

Fig. 2 shows the potential energy curves of the dimer ligands, that is, how the energy changes with configuration, scanned with a step of 0.01 Å. The black dot marks the energy of the ωB97XD/6-311++G(d,p) geometry computed with the high-accuracy CCSD/cc-pVTZ method. Overall, the results depend more on the method than on the basis set. The M06-2X-D3 method overestimates the interaction energy, while the B3LYP-D3(BJ) method underestimates it somewhat; when the basis set is enlarged to triple-zeta, however, the energy gap between B3LYP-D3(BJ) and ωB97XD narrows. In summary, M06-2X-D3 describes the energy better when combined with a basis set that includes diffuse functions, whereas B3LYP-D3(BJ) and ωB97XD overestimate the interaction energy by 1-2 kcal·mol⁻¹. The geometric and energetic errors for hexafluorobenzene (C6F6) are larger than those for the other structures.

8eeaaa8877eedb8ceb7a45805556f2be.png

Fig. 2 Potential energy curves (kcal·mol⁻¹) for series interaction structures (see Fig. 1) of sandwich dimer ligands, a) C6H6…C6H6, b) C10H8…C10H8, c) C14H10…C14H10, d) C16H10…C16H10, e) C6F6…C6F6.
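As a rough illustration of the kind of scan behind Fig. 2, the sketch below steps the face-to-face distance on a 0.01 Å grid and picks the lowest-energy point. The 12-6 `model_energy` function and its parameters (`well_depth`, `r_min`) are hypothetical stand-ins for the quantum-chemical energies; they are assumptions made for illustration, not values or methods from this work.

```python
def model_energy(r, well_depth=2.0, r_min=3.8):
    """Toy 12-6 interaction energy (kcal/mol) at distance r (angstrom).

    Hypothetical stand-in for a DFT or CCSD single-point energy.
    """
    x = r_min / r
    return well_depth * (x ** 12 - 2.0 * x ** 6)


def scan_curve(start, stop, step=0.01, energy_fn=model_energy):
    """Return (distance, energy) pairs on a regular grid, end point included."""
    n_points = int(round((stop - start) / step)) + 1
    # Rounding to two decimals matches the 0.01 angstrom step used here.
    distances = [round(start + i * step, 2) for i in range(n_points)]
    return [(r, energy_fn(r)) for r in distances]


def find_minimum(curve):
    """Return the (distance, energy) pair with the lowest energy."""
    return min(curve, key=lambda point: point[1])


curve = scan_curve(3.0, 6.0)
best_r, best_e = find_minimum(curve)
print(f"optimal distance = {best_r:.2f} A, interaction energy = {best_e:.2f} kcal/mol")
```

In a real workflow `energy_fn` would wrap a single-point calculation at each distance; the grid search itself stays the same.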

Fig. 3 summarizes the dimer geometries and interaction energies at the stationary points of the potential energy curves in Fig. 2. The boxplots show the geometry and energy optimization results from the different methods. The face-center distance of the optimal hexafluorobenzene (C6F6) dimer is similar to that of the naphthalene (C10H8) dimer, and for the arene dimers the face-center distance is essentially negatively correlated with the interaction energy: the smaller the face-center distance of the optimal configuration, the larger the interaction energy. The dependence on the computational method is generally stronger than the dependence on the basis set. For geometry, the B3LYP-D3(BJ) method, especially in combination with the 6-31++G(d,p) basis set, may somewhat overestimate the face-center distance of the optimal configuration. For energy, B3LYP-D3(BJ) combined with the double-zeta basis sets 6-31G(d,p) and 6-31++G(d,p) may overestimate the interaction energy of the optimal configuration, whereas M06-2X-D3 combined with a double-zeta basis set may underestimate it.

1beba15318b41563ca4386111b13589a.png

Fig. 3 Interaction energies (kcal·mol⁻¹) for optimized sandwich dimer structures based on the potential energy curves (see Fig. 2).
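A boxplot such as the one in Fig. 3 condenses the spread of results across method and basis-set combinations into five numbers. The sketch below computes them with Python's `statistics` module; the energy values are hypothetical placeholders chosen for illustration, not results from this work.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) as drawn in a boxplot."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


# Hypothetical interaction energies (kcal/mol) for one dimer from five
# method/basis-set combinations; placeholders, not results from this work.
energies = [-2.9, -2.5, -2.7, -2.3, -2.6]

low, q1, median, q3, high = five_number_summary(energies)
print(f"median = {median:.2f}, interquartile range = {q3 - q1:.2f}")
```

A narrow box then indicates that the methods agree on that dimer, while a wide one signals strong method or basis-set dependence.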

Energy decomposition splits the intermolecular interaction into distinct physical components, giving a deeper understanding of the nature of the interaction from an energetic perspective. One such scheme is symmetry-adapted perturbation theory (SAPT), which decomposes the interaction energy into four parts: exchange, induction, electrostatics and dispersion. The boxplots in Fig. 4 show the energy decomposition of the 12 optimized configurations of the 5 arene dimers, computed with the PSI4 program using second-order SAPT (SAPT2) and the jun-cc-pVDZ basis set. Exchange energy describes the short-range repulsion between molecules; it is positive, and the larger it is, the less favorable the binding. Its order is b1<b2<b5<b3<b4. Induction energy describes the mutual polarization and charge transfer between the molecules; it is negative, and the larger its absolute value, the more favorable the binding. Its order (absolute value) is b1<b2<b3<b4<b5. Electrostatic energy describes the classical Coulomb interaction between fragments; it is negative, and the larger its absolute value, the stronger the attraction and the more favorable the binding. Its order (absolute value) is b1<b2<b5<b3<b4. Dispersion energy describes the attraction between instantaneous dipoles; the larger its absolute value, the more favorable the binding. Its order (absolute value) is b1<b2<b5<b3<b4. In summary, except for C6F6, the exchange, induction, electrostatic and dispersion interactions all increase with the number of rings.

289ab3206d067dddafac8f3c33869e67.png

Fig. 4 SAPT2 components (kcal·mol⁻¹) of the difference between optimized sandwich dimer structures (see Fig. 2).
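The bookkeeping of the four SAPT terms can be sketched as follows: the interaction energy is their sum, with exchange positive (repulsive) and the other three negative (attractive). The class name `SaptComponents` and the numbers are assumptions chosen for easy arithmetic; they are not PSI4 output or values from Fig. 4.

```python
from dataclasses import dataclass


@dataclass
class SaptComponents:
    """SAPT energy terms for one dimer, in kcal/mol."""
    exchange: float        # repulsive, positive
    induction: float       # attractive, negative
    electrostatics: float  # attractive, negative
    dispersion: float      # attractive, negative

    def total(self):
        """Interaction energy as the sum of the four terms."""
        return self.exchange + self.induction + self.electrostatics + self.dispersion

    def attractive_shares(self):
        """Fraction of the total attraction carried by each attractive term."""
        terms = {
            "induction": self.induction,
            "electrostatics": self.electrostatics,
            "dispersion": self.dispersion,
        }
        attraction = sum(terms.values())
        return {name: value / attraction for name, value in terms.items()}


# Hypothetical numbers chosen for easy arithmetic, not values from Fig. 4.
dimer = SaptComponents(exchange=5.0, induction=-1.0, electrostatics=-2.0, dispersion=-4.0)
print(f"total = {dimer.total():.2f} kcal/mol")
print({name: round(share, 2) for name, share in dimer.attractive_shares().items()})
```

Comparing the shares across dimers is one simple way to see which attractive term dominates as the ring count grows.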

To investigate how the physical components of the weak interaction from the energy decomposition relate to the optimal configuration and interaction energy of the dimers, the Lasso (least absolute shrinkage and selection operator) method was used for feature selection, followed by a multiple regression model. In the Lasso feature selection, df=3 selects the three features induction, electrostatics and dispersion, while exchange, which is unfavorable to binding, contributes the least; df=2 selects the two features that contribute the most, electrostatics and dispersion. Using the features selected at df=3 and df=2, multiple regression models can be built for the optimal dimer configurations obtained with the ωB97XD/6-311++G(d,p) method and for the high-accuracy interaction energies obtained with the coupled-cluster method CCSD/cc-pVTZ and the double-hybrid functional PWPB95-D3/def2-QZVPP. The results are shown in Fig. 5. The selected features fit both geometry and energy well. At df=3, the residuals of the energy prediction are about ±0.2 kcal·mol⁻¹ and those of the geometry prediction about ±0.02 Å; at df=2, the residuals of the energy prediction are about ±0.5 kcal·mol⁻¹ and those of the geometry prediction about ±0.04 Å.

fe48c0fdbfe8af4278f46bb1071cf1ae.png

Fig. 5 Results of multifactor linear regression analysis between SAPT2 components (see Fig. 4) and optimized energies and structures of sandwich dimers (see Fig. 2).
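The Lasso feature selection described above can be sketched with a small coordinate-descent solver. This is a minimal illustration, not the implementation used in this work. The design matrix is a synthetic set of orthogonal ±1 columns, and `true_coefs` is invented so that the qualitative pattern (exchange dropped first, then induction) can be checked by hand. Here df is read as the number of nonzero coefficients, which is an assumption about the notation.

```python
FEATURES = ["exchange", "induction", "electrostatics", "dispersion"]


def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, max_sweeps=100, tol=1e-10):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 for centred X and y."""
    n, p = len(X), len(X[0])
    columns = [[row[j] for row in X] for j in range(p)]
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(max_sweeps):
        max_change = 0.0
        for j, col in enumerate(columns):
            scale = sum(v * v for v in col) / n
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(v * (r + w[j] * v) for v, r in zip(col, residual)) / n
            new_w = soft_threshold(rho, lam) / scale
            change = new_w - w[j]
            residual = [r - change * v for r, v in zip(residual, col)]
            w[j] = new_w
            max_change = max(max_change, abs(change))
        if max_change < tol:
            break
    return w


def selected_features(w):
    """Names of the features whose coefficient survived the shrinkage."""
    return [name for name, coef in zip(FEATURES, w) if coef != 0.0]


# Synthetic design: four mutually orthogonal +/-1 columns (Hadamard pattern),
# so each Lasso coefficient is the soft-thresholded true coefficient.
X = [[(-1) ** bin(i & j).count("1") for j in (1, 2, 4, 7)] for i in range(8)]
true_coefs = [0.05, 0.4, 1.5, 3.0]  # invented for illustration only
y = [sum(c * x for c, x in zip(true_coefs, row)) for row in X]

for lam in (0.1, 0.5):
    w = lasso_coordinate_descent(X, y, lam)
    print(f"lambda={lam}: df={len(selected_features(w))}, selected={selected_features(w)}")
```

Raising the penalty from 0.1 to 0.5 removes one more feature, which mirrors the step from df=3 to df=2 in the text; a multiple regression on the surviving features would follow.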

To compare how well different computational methods and basis sets describe the interaction, the goodness of fit of the regressions in Fig. 5 is measured by R² and shown in Fig. 6; the larger the value, the better the description. The two-feature models (df=2, electrostatics and dispersion), which have larger residuals, generally have a smaller R² than the three-feature models (df=3, induction, electrostatics and dispersion), and the basis set dependence is generally stronger than the method dependence. For energy, R² for the 6-31++G(d,p) and 6-311++G(d,p) basis sets, which include diffuse functions, is generally larger than for the 6-31G(d,p) and 6-311G(d,p) basis sets, which do not. For geometry, the advantage of diffuse functions is more pronounced at df=2.

65fdf46a14d74d51dea3f5315efb3939.png

Fig. 6 Performance of different methods (see Fig. 5) to predict interaction energies and structures of sandwich dimers by SAPT2 components.
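For reference, R² as a goodness-of-fit measure can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses hypothetical reference and predicted values; they are not data from Fig. 5 or Fig. 6.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical reference and regression-predicted values, for illustration only.
reference = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(f"R^2 = {r_squared(reference, predicted):.3f}")
```

Larger residuals raise SS_res and so lower R², which is why the df=2 models with larger residuals score lower than the df=3 models.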

To further explore how intermolecular interactions affect the geometric parameters of the titanocene catalysts, features were selected with a LASSO regression of the geometric parameters on the ADCH charges of the fragments (O, Ti, Cl, cene and ligands); the results are shown in Fig. 11. Fig. 11(a) and (b) show, respectively, how each feature coefficient and the error change while the hyperparameter lambda is optimized. At the minimum error, the coefficients of Cl and cene are 0, indicating that the charges of Cl and cene have little influence on the geometric parameters, whereas the charges of O, Ti and the ligand have a greater influence. Fig. 11(c) and (d) show, respectively, the fits of the O, Ti and ligand charges to the O−Ti bond length and the R−O−Ti bond angle. They further confirm that the ADCH charges of O, Ti and the ligand (C_O, C_Ti and C_Ligands) are important factors affecting the geometric parameters, and that the bond length L_O−Ti is predicted better than the bond angle A_R−O−Ti.

d478dc7e8bbee7a98eec12453f5c7697.png

Fig. 11 Effects of molecular fragment charges on geometric parameters of half-titanocene complex, a) the variation of coefficients and b) error bars with the hyperparameter lambda of LASSO regression; results of multiple linear regression to predict c) L_O−Ti and d) A_R−O−Ti based on feature selection.
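Once the features are selected, the multiple linear regression step can be sketched as an ordinary least-squares fit through the normal equations. The helper names, the scaled charge values in `charges` and the coefficients in `assumed` are all hypothetical; they only show the mechanics of regressing a bond length on three fragment charges and are not data from this work.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - known) / M[r][r]
    return x


def fit_ols(X, y):
    """Return [intercept, coef_1, ...] minimising the squared error."""
    rows = [[1.0] + list(row) for row in X]
    p = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * t for r, t in zip(rows, y)) for a in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical scaled charges of (O, Ti, ligand) in arbitrary units.
charges = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
assumed = [1.80, 0.05, -0.03, 0.02]  # invented intercept and slopes
bond_lengths = [
    assumed[0] + sum(c * q for c, q in zip(assumed[1:], row)) for row in charges
]

coefs = fit_ols(charges, bond_lengths)
print([round(c, 3) for c in coefs])
```

Because the toy bond lengths are generated exactly from `assumed`, the fit recovers those coefficients; with real data the residuals would show how well the charges explain the geometry.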

Editor: Wen Jing

Proofreading: Cheng Anle


Origin blog.csdn.net/tMb8Z9Vdm66wH68VX1/article/details/130096955