December 20, 2019

Department of Statistics,

Feng-Chia University

Taichung, Taiwan 40724

Dear Dr. Park

Co-Editor, Computational Statistics and Data Analysis

I am very grateful to the Editor, the Associate Editor and the two reviewers for their thoughtful and constructive suggestions. Below are my responses to their comments. The reviewers’ comments are in bold, and my responses are in plain text underneath. Thank you for your consideration.

Sincerely,

Dwueng-Chwuan Jhwueng

Ms. Ref. No.: CSDA-D-18-00866R1

Title: Statistical modeling for adaptive trait evolution in randomly evolving environment

Computational Statistics and Data Analysis

Dear Dr. Dwueng-Chwuan Jhwueng,

Reviewers have now commented on your paper. You will see that they are advising that you revise your manuscript. If you are prepared to undertake the work required, I would be pleased to reconsider my decision.

Thank you so much for considering my work. I have undertaken the work required for this revision and hope that it meets the standards of CSDA.

For your guidance, reviewers’ comments are appended below.

If you decide to revise the work, please avail of the Response to Reviewers item in the process of submitting your revised paper. You should submit a list of changes or a rebuttal against each point which is being raised when you submit the revised manuscript.

To submit a revision, please go to https://ees.elsevier.com/csda/ and login as an Author. Your username is: dcjhwueng@fcu.edu.tw

If you need to retrieve password details, please go to: http://ees.elsevier.com/CSDA/automail_query.asp

On your Main Menu page is a folder entitled “Submissions Needing Revision”. You will find your submission record there.

NOTE: Upon submitting your revised manuscript, please upload the source files for your article. For additional details regarding acceptable file formats, please refer to the Guide for Authors at: http://www.elsevier.com/journals/computational-statistics-and-data-analysis/0167-9473/guide-for-authors

When submitting your revised paper, we ask that you include the following items:

Response to Reviewers (mandatory)

This should be a separate file labeled “Response to Reviewers” that carefully addresses, point-by-point, the issues raised in the comments appended below. You should also include a suitable rebuttal to any specific request for change that you have not made. Mention the page, paragraph, and line number of any revisions that are made.

Manuscript and Figure Source Files (mandatory)

We cannot accommodate PDF manuscript files for production purposes. We also ask that when submitting your revision you follow the journal formatting guidelines. Figures and tables may be embedded within the source file for the submission as long as they are of sufficient resolution for Production. Refer to the Guide for Authors for additional information.

http://www.elsevier.com/journals/computational-statistics-and-data-analysis/0167-9473/guide-for-authors

Highlights (optional)

Highlights consist of a short collection of bullet points that convey the core findings of the article and should be submitted in a separate file in the online submission system. Please use ‘Highlights’ in the file name and include 3 to 5 bullet points (maximum 85 characters, including spaces, per bullet point). See the following website for more information http://www.elsevier.com/highlights

Graphical Abstract (optional)

Graphical Abstracts should summarize the contents of the article in a concise, pictorial form designed to capture the attention of a wide readership online. Refer to the following website for more information:

http://www.elsevier.com/graphicalabstracts.

The journal would like to invite you to enrich your article by uploading relevant code and data to the RunMyCode data repository. Your article will be then linked to a dedicated RunMyCode companion website. To create a companion website, please follow step-by-step instructions available at: http://www.runmycode.org/

Thank you for the invitation. Because RunMyCode asks for the article's standard link (URL) from the journal, I will upload the code and data once this work is accepted. Currently, all scripts and data files can be accessed at https://tonyjhwueng.info/ououcir.

The revised version of your submission is due by Jan 11, 2020.

Yours sincerely,

BYEONG U. PARK, Ph.D. Co-Editor Computational Statistics and Data Analysis

Reviewers’ comments:

Associate Editor:

I have some comments in addition to the comments by reviewers.

In section 2,2, Author mentioned intractability of OUOUCIR model but justify by OUBMBM model as an example. It would be better directly explain intractability of OUOUCIR model.

Thank you very much for your insightful suggestion. The editorial change has been made: the intractability of the OUOUCIR model is now explained directly in section 2.2 of this revision.

There is no explanation/discussion based on results of Tables 4 and 5.

I apologize for this omission. I rechecked the code and reran the simulations. Five independent runs were performed, each using 50000 replicates with acceptance rate \(\delta=0.01\) and therefore yielding 500 posterior samples. Tables 4 and 5 are now based on the pooled 5*500 = 2500 posterior samples. In addition, a large table (Table 6) has been added to show the prior and posterior distributions together with the posterior means and the true parameter values. Explanation and discussion of these three tables have been added to the manuscript.
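
To make the sample counts above concrete, the following is a minimal sketch of an ABC rejection scheme with the same bookkeeping (50000 replicates, 1% acceptance, 5 pooled runs). It is not the code used in the manuscript: the uniform prior range, the Gaussian toy simulator, and the function names `abc_rejection` and `pooled_posterior` are assumptions chosen only for illustration.

```python
import random
import statistics


def abc_rejection(observed_summary, n_replicates=50000, tolerance_rate=0.01, seed=0):
    """Toy ABC rejection sampler.

    Draws `n_replicates` parameters from a (hypothetical) uniform prior,
    simulates data for each, and keeps the fraction `tolerance_rate` of draws
    whose simulated summary statistic is closest to the observed one.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_replicates):
        theta = rng.uniform(-5.0, 5.0)                    # assumed uniform prior
        sim = [rng.gauss(theta, 1.0) for _ in range(20)]  # toy simulator, not a trait model
        distance = abs(statistics.fmean(sim) - observed_summary)
        draws.append((distance, theta))
    draws.sort(key=lambda pair: pair[0])
    n_keep = int(round(n_replicates * tolerance_rate))   # 50000 * 0.01 = 500
    return [theta for _, theta in draws[:n_keep]]


def pooled_posterior(observed_summary, n_runs=5, **kwargs):
    """Pool the accepted samples of `n_runs` independent runs (5 * 500 = 2500)."""
    pooled = []
    for run in range(n_runs):
        pooled.extend(abc_rejection(observed_summary, seed=run, **kwargs))
    return pooled
```

Calling `pooled_posterior(1.0)` with the default settings returns 2500 accepted parameter values; smaller values of `n_replicates` can be passed for a quick check.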

At the end of section 4.1, authors mentioned the results from Figures 7-8 in supplementary material. Although Figures are in the supplementary material, it would be better provide a summary of the result here.

Thank you very much for the suggestion. The figures have been updated in this revision to report the Bayesian coverage probability for the new models (OUOUCIR and OUBMCIR). A summary indicating the higher estimated coverage probability has been added to the manuscript. The plots of the coverage probability of each parameter under the OUOUCIR model are provided in Figure 4 of this revision, while those under the OUBMCIR model are provided in Figure S1 of the online supplemental material.
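
For readers less familiar with the term, a Bayesian coverage probability can be estimated as the fraction of simulated replicates whose posterior interval contains the true parameter value. The sketch below illustrates this under stated assumptions: it uses an empirical highest density interval (HDI) at a 95% level, and the function names `hdi` and `coverage_probability` are hypothetical. The interval type and level used in the manuscript may differ.

```python
import math


def hdi(samples, mass=0.95):
    """Empirical highest density interval: the shortest interval
    that contains a fraction `mass` of the sorted samples."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))       # number of points inside the interval
    best_lo, best_hi = xs[0], xs[k - 1]
    for i in range(1, n - k + 1):
        lo, hi = xs[i], xs[i + k - 1]
        if hi - lo < best_hi - best_lo:   # keep the narrowest window
            best_lo, best_hi = lo, hi
    return best_lo, best_hi


def coverage_probability(posterior_sets, true_value, mass=0.95):
    """Fraction of replicates whose HDI contains the true parameter value.

    `posterior_sets` holds one list of posterior samples per simulated replicate.
    """
    hits = 0
    for samples in posterior_sets:
        lo, hi = hdi(samples, mass)
        hits += lo <= true_value <= hi
    return hits / len(posterior_sets)
```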

In Table 5, Including the true value of the parameters makes easier to compare. Also, why \(b_1\) is not well estimated (true value is 1)? Any reason?

Thank you for pointing out this issue. The true parameter values have been added to the header of the simulation-study table for easier comparison with the posterior mean in each column. After redoing the simulation under a more rigorous setting (5 independent runs, each using a 1% tolerance rate on 50000 samples, giving 5*500 = 2500 posterior samples for each parameter), the results show a more accurate estimate of \(b_1\), with the posterior mean close to the true value. To be honest, some runs still show bias, and it would be useful to develop a better procedure or algorithm in the near future to obtain more accurate estimates.

In Figure 4, OUBMCIR has some issue to identify. What could be the reason? Can you see any evidence while applying ABC?

Thank you very much for your insightful comment. This is a very interesting question. After exploring several different settings of parameter values and taxa sizes, I found that the main reason for the identifiability issue is the taxa size: the OUBMCIR model requires more taxa for the correct model to be identified. The results are updated in Figure 5 of this revision, where the OUBMCIR model is identified well for balanced trees with 256 taxa or more (the model is correctly identified in over 77% of cases). Figure S2 in the online supplemental material shows that the OUBMCIR model can also be identified well (over 75% correct) for birth-death trees with a larger taxa size (500 taxa).

p. 19, line 3: “All priors are set to uniform distribution” Any reason? Why different from the setting in the simulation study?

Thank you so much for your insightful comment. Below is an explanation based on my understanding. In the empirical study, one concern is that we have no information on the states of the internal nodes (ancestral values); we only have the trait values at the tips of the tree, with known branch lengths and topology. When the procedure is performed to generate the posterior samples, there is higher uncertainty about the trait value at each internal node. Hence, I feel it is more appropriate to assume that all possible parameter values are equally likely, and uniform priors are therefore used for all parameters. In this revision, the results using uniform priors are reported in Tables 4, 5 and 6 to keep the manuscript consistent with the empirical analysis and cross validation (so all simulations and empirical analyses in the manuscript use uniform priors in this revision). The simulation results using informative priors can be found in Tables S1, S2 and S3 in the online supplemental material.

Some figures such as Figures 3 and 6 seems too big. Could the figures be scaled for readability?

Thank you very much for your suggestions. Those figures have been re-scaled in this revision for better readability.

I assume that the supplementary material (section 7) will be a separate file.

Thank you very much for pointing this out. The manuscript has been reorganized: the supplementary material, containing 4 tables and 2 figures, is now a separate file and is included in this submission.

Although editing English has been made, it still has several issues. I advise the author to get professional English editing service.

Thank you very much for the suggestion. I am sorry that the English in the previous revision did not meet the standard. After seeking editorial help, the manuscript has again been carefully proofread (correcting typos and grammatical errors) and edited by a native speaker who is a professional expert in this research area. I hope the English in this version meets the standard.

A few example of typos, minor issues, possible grammatical errors, etc.

p.4, line 2: access -> assess

Thank you. The editorial change has been made.

p.4, line 8: “the dynamic of the models of adaptive trait evolution” seems awkward

Thank you. The editorial change has been made.

p.4, table 1 caption: “Property of model of adaptive trait evolution” seems awkward as well

Thank you. The editorial change has been made.

p.5, line 2: “SimModelOnePathV2.R” does not fit in the text without any explanation

Thank you for your comment. SimModelOnePathV2.R is the R file used to generate Figure 1. It has been removed from the text.

p.5, eq (7): What is AZ? There was explanation in the first version but now it is removed.

Thank you for your question. \(A\) is a 3 by 3 matrix containing the force parameters \(\alpha_y\), \(\alpha_\theta\) and \(\alpha_\tau\), while \(Z_t=(y_t,\theta_t^y,\tau_t^y)'\) is the random vector. \(AZ\) is the vector of linear combinations of the components of the random vector, weighted by the force parameters. The explanation has been added to this revision.

p.7, line -5: Is ‘1’ in “OU1 process” typo?

Thank you for your question. The OU1 process is the basic model in this area, from the Hansen 1997 paper. An editorial change has been made to clarify this.

p.12, The sentence starts with “Approximate Bayesian Computation..” seems awkward

Thank you. The editorial change has been made.

p.15, line -5: What does it mean by “Student_t(1,0,2.5)”? The author needs to explain the notation.

Thank you for your suggestion. The editorial change has been made. The notation denotes a t distribution with 1 degree of freedom (a Cauchy distribution).
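
The equivalence can be checked numerically. The sketch below assumes that the three arguments of Student_t(1,0,2.5) are read as (degrees of freedom, location, scale), as is common in Bayesian software notation; this reading and the helper names `student_t_pdf` and `cauchy_pdf` are assumptions made for illustration.

```python
import math


def student_t_pdf(x, df, loc=0.0, scale=1.0):
    """Density of a location-scale Student t distribution."""
    z = (x - loc) / scale
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + z * z / df) ** (-(df + 1) / 2) / scale


def cauchy_pdf(x, loc=0.0, scale=1.0):
    """Density of a Cauchy distribution with the given location and scale."""
    z = (x - loc) / scale
    return 1.0 / (math.pi * scale * (1 + z * z))


# With 1 degree of freedom the two densities agree at every point checked.
for x in (-10.0, -1.0, 0.0, 0.5, 2.5, 7.0):
    assert math.isclose(student_t_pdf(x, df=1, loc=0.0, scale=2.5),
                        cauchy_pdf(x, loc=0.0, scale=2.5))
```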

p.16, line -2, -5: values for % are not correct. 0.56% -> 56%, 0.18% -> 18%

Thank you. The editorial change has been made.

Reviewer no. 2:

I only have two minor comments:

It would be very helpful to add one short paragraph at the beginning of the Results (section 4), to summarize which results will be presented. For instance, the section 4.1 “Simulation” starts very abruptly and the reader needs to go back into the text to verify what simulations the author is referring to, and how these simulations were generated. Section 4.2 has this introductory paragraph and it it, thus, very easy to follow the rest. I suggest something similar for section 4.1.

Thank you very much for your constructive suggestion. An introductory paragraph stating which results will be presented is now provided in section 4.1, where two simulations (parameter estimation for all models, and Bayesian coverage probability for each parameter in the new models) are described. Because this section was reorganized after several extensive simulations (Table 6) were performed, the arrangement of the results has changed somewhat in this revision. I hope this improves the quality of the paper so that it meets the standard of CSDA.

In your new cross-validation results you find that it is hard to distinguish between OUBMCIR and OUOUCIR when the true model is OUBMCIR (for all taxa sizes). Would you discuss why this might be the case?

Many thanks for raising this concern. A paragraph discussing this issue has been added in this revision. After redoing the simulation with larger taxa sizes (please see Figure 5, which uses balanced trees of 128, 256 and 512 taxa, and Figure S2 in the online supplemental material, which uses birth-death trees of 500 taxa), I found that it is easier to distinguish between OUBMCIR and OUOUCIR when the taxa size is larger.

Moreover, two sets of simulations were performed using different types of trees (balanced trees and birth-death trees). Comparing Figure 5 with Figure S2 in the supplemental material, I found that the models are better identified in the balanced tree case than in the birth-death tree case. This may be because the birth-death tree is randomly simulated at each replicate, while the balanced tree is fixed throughout the simulation. The tree type therefore does affect model identifiability in the cross validation analysis, in particular for the OUBMCIR model.

Reviewer no. 3:

The authors proposed OUOUCIR and OUBMCIR models. However, there are a lot of texts on describing OUOUBM and OUBMBM, which have been studied by others. I think those contents are redundant and should be more concise. The paper should focus on the new models.

Thank you very much for your constructive suggestions. The editorial change has been made, and the manuscript now focuses more on the new models (the OUBMCIR model and the OUOUCIR model). In particular, an explanation of the mathematical properties of the SDEs in the OUOUCIR model has been added in section 2.1, and an explanation of the intractability of the OUOUCIR model is provided in section 2.2. The simulation for computing the Bayesian coverage probability focuses on the new models, and the results are reported in section 4.1.2 (please see Figure 4 in the manuscript and Figure S1 in the supplemental material). Furthermore, the interpretation of the cross validation simulation results in this revision focuses more on the OUBMCIR and OUOUCIR models than on the other models.

The application of ABC methods on trait evolution has been studied by referred papers [30] and [31], which somehow affect the novelty and contribution of this work, though it is applied to the proposed new models.

Thank you very much for your constructive criticism. I fully agree that ABC methods have been studied in the cited papers. Let me try to explain the contribution of this work in more detail. The new models in this work mainly focus on using a CIR process to model the rate of adaptive trait evolution \(\tau_t^y\). Because the distribution of the trait variable \(y_t\) in the new models has an intractable likelihood, ABC is applied here for model inference. In contrast, the models in [30] (for the branching OU process) and [31] (for a univariate continuous trait only) have known distributions and model likelihoods, although the authors of those papers nevertheless emphasize the use of the ABC approach for inference. I hope this explanation distinguishes my work from [30,31] in the literature.
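
As a rough illustration of what a CIR-type rate process looks like, the sketch below simulates one path of the standard CIR equation \(d\tau_t=\alpha(\mu-\tau_t)\,dt+\sigma\sqrt{\tau_t}\,dW_t\) using an Euler-Maruyama scheme truncated at zero. The parameter names `alpha`, `mean_level` and `sigma`, and the function name `simulate_cir`, are hypothetical; the parameterization in the manuscript may differ, and this is not the simulator used there.

```python
import math
import random


def simulate_cir(tau0, alpha, mean_level, sigma, t_end=1.0, n_steps=1000, seed=0):
    """Simulate one path of a CIR process by Euler-Maruyama with truncation at zero.

    The state is clipped at 0 after each step so that the square root in the
    diffusion term is always defined.
    """
    rng = random.Random(seed)
    dt = t_end / n_steps
    tau = tau0
    path = [tau]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))            # Brownian increment over dt
        drift = alpha * (mean_level - tau) * dt       # pull toward the long-run level
        diffusion = sigma * math.sqrt(tau) * dw       # noise scales with sqrt of the state
        tau = max(tau + drift + diffusion, 0.0)       # keep the rate non-negative
        path.append(tau)
    return path
```

Because the noise term shrinks as the state approaches zero, the simulated rate stays non-negative, which is the property that makes a CIR process a natural candidate for modeling a rate parameter.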

The revised version still has quite many problems on the writing, such as typos, grammar issues.

Thank you very much for your comments. The typos and grammatical errors from the last revision have been corrected in this revision. The manuscript has been carefully edited by a professional researcher in this research area who is also a native speaker. I hope it now meets the high standard of CSDA.

List of R code and R Data file in the manuscript and supplemental material.

Below is a list of clickable links to the scripts and relevant files for reproducing the results (tables, figures, plots) in this revision. I hope these links make it easier for the associate editor and reviewers to review this work. All files can be accessed at https://tonyjhwueng.info/ououcir/. Thank you.

In the manuscript

Figure/Table	link
Figure 1:	https://tonyjhwueng.info/ououcir/SimModelOnePath.html
Figure 2:	https://tonyjhwueng.info/ououcir/3taxaexample.html
Figure 3:	https://tonyjhwueng.info/ououcir/3taxaTraitdemoMod.pptx
Figure 4:	http://www.tonyjhwueng.info/ououcir/summaryplotHDI.html
Tables 4 and 5:	http://www.tonyjhwueng.info/ououcir/unifsimtable.html
Table 6:	http://www.tonyjhwueng.info/ououcir/unifdistplot
Figure 5:	http://www.tonyjhwueng.info/ououcir/cvbalancedtree.html
Figure 6:	http://www.tonyjhwueng.info/ououcir/sanchezLaskercoralPhyloPlot.html
Figure 7:	http://www.tonyjhwueng.info/ououcir/graphviz.html
Tables 7 and 8:	http://www.tonyjhwueng.info/ououcir/paramstable.html

In the supplemental material.

Figure/Table	link
Tables S1 and S2:	http://www.tonyjhwueng.info/ououcir/nonunifsimtable.html
Table S3:	http://www.tonyjhwueng.info/ououcir/nonunifdistplot
Figure S1:	http://www.tonyjhwueng.info/ououcir/summaryplotHDI.html
Figure S2:	http://www.tonyjhwueng.info/ououcir/cvbirthdeathtree.html
Table S4:	http://www.tonyjhwueng.info/ououcir/EmpiricalMaincodeV2/treetraitV2/

GUIDELINES FOR PREPARING THE REVISED PAPER

Please take the following in to account when you prepare the final version of your paper:

Could you please write the abstract in the third person without having expressions like “We”, “In this paper”, “Here”, “This work”, etc. PLEASE AVOID EQUATIONS IN THE ABSTRACT.

Thanks, done.

Avoid references in the abstract. IF THIS IS REALLY NECESSARY,then provide complete and abbreviated information. I.e. (Authors, abbr. journal, pages, vol., year).

Thanks, done.

Do not use vertical lines in tables.

Thanks, done.

Add the tables at the appropriate place in the paper, i.e. NOT at the end. If you use LATEX, then please incorporate the figures at their appropriate place in the paper too.

Thanks, done.

Add full stops and commas at the end of equations.

Thanks, done.

The article is written using the CSDA guidelines (see Guide for Authors): http://www.elsevier.com/locate/csda
If you are using Latex, then could you please use the style files of the publishers which can be found at http://www.elsevier.com/locate/latex/journals/
The complete postal address, email, tel. and fax of the corresponding author should be shown as a footnote in the first page. Avoid having any other footnotes.
If you have attachments (software or data sets) that will appear as annexes in the electronic version of your mns, then please do mention this as a footnote in the first page.
In multiple equations have commas at the end of each eqn and an “and” between the last pair. E.g.eqn 1,eqn 2,eqn 3 and eqn

Thanks, done.

Data in Brief (optional):

We invite you to convert your supplementary data (or a part of it) into an additional journal publication in Data in Brief, a multi-disciplinary open access journal. Data in Brief articles are a fantastic way to describe supplementary data and associated metadata, or full raw datasets deposited in an external repository, which are otherwise unnoticed. A Data in Brief article (which will be reviewed, formatted, indexed, and given a DOI) will make your data easier to find, reproduce, and cite.

You can submit to Data in Brief via the Computational Statistics and Data Analysis submission system when you upload your revised Computational Statistics and Data Analysis manuscript. To do so, complete the template and follow the co-submission instructions found here: www.elsevier.com/dib-template. If your Computational Statistics and Data Analysis manuscript is accepted, your Data in Brief submission will automatically be transferred to Data in Brief for editorial review and publication.

Please note: an open access Article Publication Charge (APC) is payable by the author or research funder to cover the costs associated with publication in Data in Brief and ensure your data article is immediately and permanently free to access by all. For the current APC see: www.elsevier.com/journals/data-in-brief/2352-3409/open-access-journal

Please contact the Data in Brief editorial office at dib-me@elsevier.com or visit the Data in Brief homepage (www.journals.elsevier.com/data-in-brief/) if you have questions or need further information.

Include interactive data visualizations in your publication and let your readers interact and engage more closely with your research. Follow the instructions here: https://www.elsevier.com/authors/author-services/data-visualization to find out about available data visualization options and how to include them with your article.

MethodsX file (optional)

We invite you to submit a method article alongside your research article. This is an opportunity to get full credit for the time and money you have spent on developing research methods, and to increase the visibility and impact of your work. If your research article is accepted, your method article will be automatically transferred over to the open access journal, MethodsX, where it will be editorially reviewed and published as a separate method article upon acceptance. Both articles will be linked on ScienceDirect. Please use the MethodsX template available here when preparing your article: https://www.elsevier.com/MethodsX-template. Open access fees apply.

Thank you so much. An R package, ouxy, has been developed and can be accessed at https://cran.r-project.org/web/packages/ouxy/index.html, and a method article titled “Building an adaptive trait simulator package to infer parametric diffusion model along phylogenetic tree” is included with this submission.

For further assistance, please visit our customer support site at http://help.elsevier.com/app/answers/list/p/7923. Here you can search for solutions on a range of topics, find answers to frequently asked questions and learn more about EES via interactive tutorials. You will also find our 24/7 support contact details should you need any further assistance from one of our customer support representatives.
