Transactions on Data Analysis in Social Science

Determining the Factors Affecting the Incidence of Hypertension in Pregnant Women Using Data Mining Techniques

Document Type: Original Article

Authors
1 M.Sc. in Software Engineering, IT Supervisor, Virtual School, Tehran University of Medical Sciences, Tehran, Iran
2 Assistant Professor, Department of Computer Engineering, Faculty of Engineering, Imam Khomeini International University, Qazvin, Iran
3 Deputy of Administration and Finance, Virtual School, Tehran University of Medical Sciences, Tehran, Iran
Abstract
Hypertensive disorders in pregnancy are recognized as one of the major complications during gestation, posing serious risks to both the mother and the fetus. These disorders can result in stillbirths and preterm deliveries among otherwise normal pregnancies and are considered the third leading cause of maternal mortality worldwide. However, their exact etiology remains largely unknown. The main objective of this study was to identify the demographic factors influencing the incidence of hypertension in pregnant women using data mining algorithms. The study database included 4,818 records and 80 features, extracted from electronic health records registered in the Tehran University of Medical Sciences health centers through the SIB system of the Ministry of Health and Medical Education. The study followed the CRISP-DM methodology for implementation. Due to class imbalance in the dataset, modeling was performed in two ways: (1) using basic algorithms such as C5.4 decision tree, ID3, CHAID, and artificial neural networks; and (2) using ensemble methods that combined bagging and boosting with the aforementioned algorithms. According to the developed models, the most significant predictors of hypertension in pregnant women included negative Rh factor, maternal age, nutritional habits (consumption of fruits, salt, and type of oil), history of preeclampsia, smoking, marital status, and presence of other hypertensive risk factors. The results showed that the hybrid model combining C5.4 and CHAID decision trees achieved the highest accuracy (75%) in classifying hypertensive cases. The bagging ensemble with C5.4 and ID3 improved accuracy by 4.17%, while the bagging–neural network combination increased it by 30%. Other models employing bagging and boosting techniques did not show notable improvements.
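The bagging step described in the abstract can be illustrated with a minimal sketch. This is not the study's implementation: the abstract does not name the software used, so the example below is written in plain Python, substitutes one-level decision trees (stumps) for the decision-tree and neural-network learners, and runs on synthetic, imbalanced data. The feature names (`age`, `salt`), the 15% positive rate, and all function names are assumptions made for illustration only. Because accuracy alone can look good on imbalanced data even when the minority class is missed, the sketch also reports recall for the positive class.

```python
import random
from collections import Counter


def stump_predict(stump, row):
    """Predict 0/1 from a one-level tree: (feature index, threshold, polarity)."""
    feature, threshold, polarity = stump
    above = row[feature] > threshold
    return int(above) if polarity else int(not above)


def fit_stump(rows, labels):
    """Exhaustively pick the split with the fewest training errors."""
    best_stump, best_errors = None, len(rows) + 1
    for feature in range(len(rows[0])):
        for threshold in sorted({r[feature] for r in rows}):
            for polarity in (0, 1):
                stump = (feature, threshold, polarity)
                errors = sum(stump_predict(stump, r) != y
                             for r, y in zip(rows, labels))
                if errors < best_errors:
                    best_stump, best_errors = stump, errors
    return best_stump


def fit_bagging(rows, labels, n_estimators, rng):
    """Bagging: train each base learner on a bootstrap resample of the data."""
    n = len(rows)
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ensemble.append(fit_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx]))
    return ensemble


def bagging_predict(ensemble, row):
    """Majority vote over the base learners (use an odd ensemble size)."""
    votes = Counter(stump_predict(s, row) for s in ensemble)
    return votes.most_common(1)[0][0]


def make_synthetic(n, positive_rate, rng):
    """Hypothetical imbalanced data; positives are shifted upward on both features."""
    rows, labels = [], []
    for _ in range(n):
        y = 1 if rng.random() < positive_rate else 0
        age = rng.gauss(34 if y else 28, 5)      # assumed feature, not study data
        salt = rng.gauss(6 if y else 4, 1.5)     # assumed feature, not study data
        rows.append([age, salt])
        labels.append(y)
    return rows, labels


def accuracy_and_recall(predictions, labels):
    """Return (overall accuracy, recall on the positive class)."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    positives = sum(labels)
    true_pos = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
    recall = true_pos / positives if positives else 0.0
    return correct / len(labels), recall


rng = random.Random(0)
train_x, train_y = make_synthetic(200, 0.15, rng)
test_x, test_y = make_synthetic(200, 0.15, rng)
ensemble = fit_bagging(train_x, train_y, n_estimators=11, rng=rng)
preds = [bagging_predict(ensemble, r) for r in test_x]
acc, rec = accuracy_and_recall(preds, test_y)
print(f"accuracy={acc:.2f} minority-class recall={rec:.2f}")
```

Boosting differs from this sketch in that the base learners are trained sequentially, with each round giving more weight to the records the previous learners misclassified, rather than independently on bootstrap resamples.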
Keywords

Volume 1, Issue 2
Spring 2019
Pages 59-70

  • Received: 16 January 2019
  • Revised: 04 March 2019
  • Accepted: 21 May 2019