Barnhart, H. X., Haber, M. J., & Lin, L. I. (2007). An Overview on Assessing Agreement with Continuous Measurements. Journal of Biopharmaceutical Statistics, 17(4), 529–569. https://doi.org/10.1080/10543400701376480
Barnhart, H. X., Yow, E., Crowley, A. L., Daubert, M. A., Rabineau, D., Bigelow, R., Pencina, M., & Douglas, P. S. (2016). Choice of agreement indices for assessing and improving measurement reproducibility in a core laboratory setting. Statistical Methods in Medical Research, 25(6), 2939–2958. https://doi.org/10.1177/0962280214534651
Bartolo, R., & Averbeck, B. B. (2020). Prefrontal Cortex Predicts State Switches during Reversal Learning. Neuron, 106(6), 1044-1054.e4. https://doi.org/10.1016/j.neuron.2020.03.024
Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S. J., & Munafò, M. R. (2013). Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14(5), 365–376. https://doi.org/10.1038/nrn3475
Carleton, R. N., Norton, M. A. P. J., & Asmundson, G. J. G. (2007). Fearing the unknown: A short version of the Intolerance of Uncertainty Scale. Journal of Anxiety Disorders, 21(1), 105–117. https://doi.org/10.1016/j.janxdis.2006.03.014
Chen, J., Ooi, L. Q. R., Tan, T. W. K., Zhang, S., Li, J., Asplund, C. L., Eickhoff, S. B., Bzdok, D., Holmes, A. J., & Yeo, B. T. T. (2023). Relationship between prediction accuracy and feature importance reliability: An empirical and theoretical study. NeuroImage, 274, 120115. https://doi.org/10.1016/j.neuroimage.2023.120115
Cicchetti, D. V. (1994). Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychological Assessment, 6, 284–290. https://doi.org/10.1037/1040-3590.6.4.284
Clarke, P., & Wheaton, B. (2007). Addressing Data Sparseness in Contextual Population Research: Using Cluster Analysis to Create Synthetic Neighborhoods. Sociological Methods & Research, 35(3), 311–351. https://doi.org/10.1177/0049124106292362
Cokelaer, T., Kravchenko, A., lahdjirayhan, msat59, Varma, A., L, B., Stringari, C. E., Brueffer, C., Broda, E., Pruesse, E., Singaravelan, K., Russo, S. A., Li, Z., padgham, mark, & negodfre. (2024). cokelaer/fitter: V1.7.0 (Version v1.7.0) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.10459943
Costa, V. D., Tran, V. L., Turchi, J., & Averbeck, B. B. (2015). Reversal Learning and Dopamine: A Bayesian Perspective. Journal of Neuroscience, 35(6), 2407–2416. https://doi.org/10.1523/JNEUROSCI.1989-14.2015
Dajani, D. R., & Uddin, L. Q. (2015). Demystifying cognitive flexibility: Implications for clinical and developmental neuroscience. Trends in Neurosciences, 38(9), 571–578. https://doi.org/10.1016/j.tins.2015.07.003
de Winter, J. C. F., Gosling, S. D., & Potter, J. (2016). Comparing the Pearson and Spearman correlation coefficients across distributions and sample sizes: A tutorial using simulations and empirical data. Psychological Methods, 21(3), 273–290. https://doi.org/10.1037/met0000079
Doros, G., & Lew, R. (2010). Design Based on Intra-Class Correlation Coefficients. Current Research in Biostatistics, 1(1), 1–8. https://doi.org/10.3844/amjbsp.2010.1.8
Freyer, T., Valerius, G., Kuelz, A.-K., Speck, O., Glauche, V., Hull, M., & Voderholzer, U. (2009). Test–retest reliability of event-related functional MRI in a probabilistic reversal learning task. Psychiatry Research: Neuroimaging, 174(1), 40–46. https://doi.org/10.1016/j.pscychresns.2009.03.003
Gell, M., Eickhoff, S. B., Omidvarnia, A., Küppers, V., Patil, K. R., Satterthwaite, T. D., Müller, V. I., & Langner, R. (2023). The Burden of Reliability: How Measurement Noise Limits Brain-Behaviour Predictions (p. 2023.02.09.527898). bioRxiv. https://doi.org/10.1101/2023.02.09.527898
Gershman, S. J. (2016). Empirical priors for reinforcement learning models. Journal of Mathematical Psychology, 71, 1–6. https://doi.org/10.1016/j.jmp.2016.01.006
Gorgolewski, K. J., Storkey, A. J., Bastin, M. E., Whittle, I., & Pernet, C. (2013). Single subject fMRI test–retest reliability metrics and confounding factors. NeuroImage, 69, 231–243. https://doi.org/10.1016/j.neuroimage.2012.10.085
Hedge, C., Powell, G., & Sumner, P. (2018). The reliability paradox: Why robust cognitive tasks do not produce reliable individual differences. Behavior Research Methods, 50(3), 1166–1186. https://doi.org/10.3758/s13428-017-0935-1
Hirschfeld, G., Brachel, R. von, & Thielsch, M. (2014). Selecting items for Big Five questionnaires: At what sample size do factor loadings stabilize? Journal of Research in Personality, 53, 54–63. https://doi.org/10.1016/j.jrp.2014.08.003
Huang, J. L., Bowling, N. A., Liu, M., & Li, Y. (2015). Detecting Insufficient Effort Responding with an Infrequency Scale: Evaluating Validity and Participant Reactions. Journal of Business and Psychology, 30(2), 299–311. https://doi.org/10.1007/s10869-014-9357-6
Huys, Q. J. M., Cools, R., Gölzer, M., Friedel, E., Heinz, A., Dolan, R. J., & Dayan, P. (2011). Disentangling the Roles of Approach, Activation and Valence in Instrumental and Pavlovian Responding. PLOS Computational Biology, 7(4), e1002028. https://doi.org/10.1371/journal.pcbi.1002028
Huys, Q. J. M., Eshel, N., O’Nions, E., Sheridan, L., Dayan, P., & Roiser, J. P. (2012). Bonsai Trees in Your Head: How the Pavlovian System Sculpts Goal-Directed Choices by Pruning Decision Trees. PLOS Computational Biology, 8(3), e1002410. https://doi.org/10.1371/journal.pcbi.1002410
Izquierdo, A., Brigman, J. L., Radke, A. K., Rudebeck, P. H., & Holmes, A. (2017). The neural basis of reversal learning: An updated perspective. Neuroscience, 345, 12–26. https://doi.org/10.1016/j.neuroscience.2016.03.021
Kraus, B., Zinbarg, R., Braga, R. M., Nusslock, R., Mittal, V. A., & Gratton, C. (2023). Insights from Personalized Models of Brain and Behavior for Identifying Biomarkers in Psychiatry. Neuroscience & Biobehavioral Reviews, 105259. https://doi.org/10.1016/j.neubiorev.2023.105259
Kretzschmar, A., & Gignac, G. E. (2019). At what sample size do latent variable correlations stabilize? Journal of Research in Personality, 80, 17–22. https://doi.org/10.1016/j.jrp.2019.03.007
Liljequist, D., Elfving, B., & Roaldsen, K. S. (2019). Intraclass correlation – A discussion and demonstration of basic features. PLOS ONE, 14(7), e0219854. https://doi.org/10.1371/journal.pone.0219854
Lydon-Staley, D. M., Barnett, I., Satterthwaite, T. D., & Bassett, D. S. (2019). Digital phenotyping for psychiatry: Accommodating data and theory with network science methodologies. Current Opinion in Biomedical Engineering, 9, 8–13. https://doi.org/10.1016/j.cobme.2018.12.003
Maas, C. J. M., & Hox, J. J. (2004). Robustness issues in multilevel regression analysis. Statistica Neerlandica, 58(2), 127–137. https://doi.org/10.1046/j.0039-0402.2003.00252.x
Maas, C. J. M., & Hox, J. J. (2005). Sufficient Sample Sizes for Multilevel Modeling. Methodology, 1(3), 86–92. https://doi.org/10.1027/1614-2241.1.3.86
Marek, S., Tervo-Clemmens, B., Calabro, F. J., Montez, D. F., Kay, B. P., Hatoum, A. S., Donohue, M. R., Foran, W., Miller, R. L., Hendrickson, T. J., Malone, S. M., Kandala, S., Feczko, E., Miranda-Dominguez, O., Graham, A. M., Earl, E. A., Perrone, A. J., Cordova, M., Doyle, O., … Dosenbach, N. U. F. (2022). Reproducible brain-wide association studies require thousands of individuals. Nature, 603(7902), Article 7902. https://doi.org/10.1038/s41586-022-04492-9
McGraw, K. O., & Wong, S. P. (1996). Forming inferences about some intraclass correlation coefficients. Psychological Methods, 1(1), 30. https://doi.org/10.1037/1082-989X.1.1.30
Neubauer, A. B., Voelkle, M. C., Voss, A., & Mertens, U. K. (2020). Estimating Reliability of Within-Person Couplings in a Multilevel Framework. Journal of Personality Assessment, 102(1), 10–21. https://doi.org/10.1080/00223891.2018.1521418
Paccagnella, O. (2011). Sample Size and Accuracy of Estimates in Multilevel Models. Methodology, 7(3), 111–120. https://doi.org/10.1027/1614-2241/a000029
Papoutsaki, A., Sangkloy, P., Laskey, J., Daskalova, N., Huang, J., & Hays, J. (2016). WebGazer: Scalable Webcam Eye Tracking Using User Interaction. 3839–3845.
Piray, P., & Daw, N. D. (2020). A simple model for learning in volatile environments. PLOS Computational Biology, 16(7), e1007963. https://doi.org/10.1371/journal.pcbi.1007963
Piray, P., Dezfouli, A., Heskes, T., Frank, M. J., & Daw, N. D. (2019). Hierarchical Bayesian inference for concurrent model fitting and comparison for group studies. PLOS Computational Biology, 15(6), e1007043. https://doi.org/10.1371/journal.pcbi.1007043
Reddy, L. F., Waltz, J. A., Green, M. F., Wynn, J. K., & Horan, W. P. (2016). Probabilistic Reversal Learning in Schizophrenia: Stability of Deficits and Potential Causal Mechanisms. Schizophrenia Bulletin, 42(4), 942–951. https://doi.org/10.1093/schbul/sbv226
Schaaf, J. V., Weidinger, L., Molleman, L., & van den Bos, W. (2023). Test–retest reliability of reinforcement learning parameters. Behavior Research Methods. https://doi.org/10.3758/s13428-023-02203-4
Schönbrodt, F. D., & Perugini, M. (2013). At what sample size do correlations stabilize? Journal of Research in Personality, 47(5), 609–612. https://doi.org/10.1016/j.jrp.2013.05.009
Smith, P. L., & Little, D. R. (2018). Small is beautiful: In defense of the small-N design. Psychonomic Bulletin & Review, 25(6), 2083–2101. https://doi.org/10.3758/s13423-018-1451-8
Spisak, T., Bingel, U., & Wager, T. D. (2023). Multivariate BWAS can be replicable with moderate sample sizes. Nature, 615(7951), E4–E7. https://doi.org/10.1038/s41586-023-05745-x
Tiego, J., Martin, E. A., DeYoung, C. G., Hagan, K., Cooper, S. E., Pasion, R., Satchell, L., Shackman, A. J., Bellgrove, M. A., & Fornito, A. (2023). Precision behavioral phenotyping as a strategy for uncovering the biological correlates of psychopathology. Nature Mental Health, 1(5), Article 5. https://doi.org/10.1038/s44220-023-00057-5
Waltmann, M., Schlagenhauf, F., & Deserno, L. (2022). Sufficient reliability of the behavioral and computational readouts of a probabilistic reversal learning task. Behavior Research Methods. https://doi.org/10.3758/s13428-021-01739-7
Williams, B., & Christakou, A. (2022). Dissociable roles for the striatal cholinergic system in different flexibility contexts. IBRO Neuroscience Reports, 12, 260–270. https://doi.org/10.1016/j.ibneur.2022.03.007
Yu, C., Beckmann, J. F., & Birney, D. P. (2019). Cognitive flexibility as a meta-competency / Flexibilidad cognitiva como meta-competencia. Estudios de Psicología, 40(3), 563–584. https://doi.org/10.1080/02109395.2019.1656463
Zorowitz, S., Solis, J., Niv, Y., & Bennett, D. (2023). Inattentive responding can induce spurious associations between task behaviour and symptom measures. Nature Human Behaviour, 7(10), 1667–1681. https://doi.org/10.1038/s41562-023-01640-7