Modeling Multivariate Longitudinal Data Using Vine Pair Copula Constructions

Article type: Research article

Authors

1 Department of Statistics and Epidemiology, Faculty of Health, Ahvaz Jundishapur University of Medical Sciences

2 Department of Statistics, Mathematical Sciences and Computer Faculty, Shahid Chamran University of Ahvaz

3 Nutrition and Metabolic Disease Research Center, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran

Abstract

In some medical studies, several measurements may be taken on each patient. One approach in this setting is to include random effects in the model. Sometimes such longitudinal data are recorded for several response variables. Although the responses can then be modeled separately, doing so reduces the power and efficiency of the estimated covariate effects on the responses. In such models, the dependence among the responses must be modeled in addition to the dependence among the repeated measurements of each response. Among the methods that have attracted considerable attention in recent years for modeling multivariate data is the copula function. One of the main advantages of the copula approach over classical multivariate longitudinal modeling is that the marginal distributions are not restricted to the normal distribution; any other distribution may be used. The marginals may even come from different distribution families. When the data have a multivariate structure, one way to construct multivariate distributions is to use vine pair copulas. In this study, we build a multivariate longitudinal structure from several copula families by means of vine pair copulas and compare these models with the model obtained by fitting a multivariate normal copula. We then identify the best model using the Akaike information criterion and, finally, apply the proposed model to data on the effect of nutrition on infant growth.
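
As a point of reference, the vine pair-copula idea mentioned in the abstract can be written out for three variables. The sketch below uses the standard D-vine decomposition. The symbols are notation assumed here and are not taken from the article: f_j and F_j are marginal densities and distribution functions, each c is a bivariate copula density, k is the number of estimated parameters, and \hat{\ell} is the maximized log-likelihood.

```latex
\[
f(x_1,x_2,x_3) = \prod_{j=1}^{3} f_j(x_j)\;
  c_{12}\bigl(F_1(x_1),F_2(x_2)\bigr)\,
  c_{23}\bigl(F_2(x_2),F_3(x_3)\bigr)\,
  c_{13\mid 2}\bigl(F_{1\mid 2}(x_1\mid x_2),\,F_{3\mid 2}(x_3\mid x_2)\bigr)
\]
\[
\mathrm{AIC} = 2k - 2\hat{\ell}
\]
```

Each pair copula in the product may come from a different family, which is what gives the construction its flexibility. Candidate models are then ranked by AIC, with smaller values preferred.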


Article title [English]

Modeling Multivariate Longitudinal Data Using Vine Pair Copula Constructions

Authors [English]

  • Mohammad Sadegh LoeLoe 1
  • Mohammad Reza Akhoond 2
  • Kambiz Ahmadi Angali 1
  • Fatemeh Borazjani 3
1 Department of Biostatistics, Faculty of Health, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran
2 Department of Statistics, Mathematical Sciences and Computer Faculty, Shahid Chamran University of Ahvaz, Ahvaz, Iran
3 Nutrition and Metabolic Disease Research Center, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran
Abstract [English]

In some medical studies, several measurements are taken on each patient. Sometimes these longitudinal data are recorded for several response variables. Although the responses can then be modeled separately, such an approach reduces the power and efficiency of the estimated covariate effects on the responses. In analyzing such data, the dependence between the responses should be considered in addition to the dependence between the repeated measures of each response variable. Among the methods used in recent years to model multivariate data is the copula function. One of its most important advantages over classical multivariate longitudinal modeling is that the marginal distributions need not be normal; any other distribution can be used. The marginal distributions can even belong to different families. When the data have a multivariate structure, one way to form multivariate distributions is to use vine pair-copula functions. In this study, we form a multivariate longitudinal structure using vine pair-copula functions and compare these models with the model obtained by fitting the multivariate normal copula function. We then identify the best model using the Akaike information criterion and, finally, apply the proposed model to data on the effect of nutrition on infant growth.
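
The model comparison described in the abstract (competing copula models ranked by the Akaike information criterion) can be illustrated on a toy scale. The following is a minimal sketch under stated assumptions, not the authors' procedure: it uses simulated bivariate data rather than the infant growth data, compares only a Gaussian and a Clayton copula (families chosen here for illustration), and all function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

EPS = 1e-12  # keeps uniforms strictly inside (0, 1)

def clamp(p):
    return min(max(p, EPS), 1 - EPS)

def simulate_clayton(n, theta, seed=0):
    """Draw n pairs (u, v) from a Clayton copula by conditional inversion."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u, w = clamp(rng.random()), clamp(rng.random())
        v = (u ** -theta * (w ** (-theta / (1 + theta)) - 1) + 1) ** (-1 / theta)
        pairs.append((u, clamp(v)))
    return pairs

def gaussian_loglik(rho, scores):
    """Log-likelihood of a bivariate Gaussian copula; scores are normal quantiles."""
    r2 = rho * rho
    return sum(-0.5 * math.log(1 - r2)
               - (r2 * (x * x + y * y) - 2 * rho * x * y) / (2 * (1 - r2))
               for x, y in scores)

def clayton_loglik(theta, pairs):
    """Log-likelihood of a bivariate Clayton copula (theta > 0)."""
    return sum(math.log1p(theta)
               - (1 + theta) * (math.log(u) + math.log(v))
               - (2 + 1 / theta) * math.log(u ** -theta + v ** -theta - 1)
               for u, v in pairs)

def maximize(f, lo, hi, tol=1e-6):
    """Golden-section search for the maximizer of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:          # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = f(c)
        else:                # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = f(d)
    x = (a + b) / 2
    return x, f(x)

# Simulated data (an assumption of this sketch): Clayton dependence, theta = 2.
pairs = simulate_clayton(1000, theta=2.0, seed=1)
inv = NormalDist().inv_cdf
scores = [(inv(u), inv(v)) for u, v in pairs]

rho_hat, ll_gauss = maximize(lambda r: gaussian_loglik(r, scores), -0.99, 0.99)
theta_hat, ll_clay = maximize(lambda t: clayton_loglik(t, pairs), 0.01, 20.0)

# AIC = 2k - 2 * loglik, with k = 1 parameter for each copula.
aic = {"gaussian": 2 - 2 * ll_gauss, "clayton": 2 - 2 * ll_clay}
best = min(aic, key=aic.get)
print(f"rho_hat={rho_hat:.3f}, theta_hat={theta_hat:.3f}, best by AIC: {best}")
```

In a full vine model the same comparison would be repeated for every pair copula in the construction, and the marginals would be estimated from the data instead of being uniform by design.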

Keywords [English]

  • Longitudinal measurements
  • Multivariate normal copula function
  • Vine pair copula
  • Infant growth
  • Infant nutrition