# Introduction

Survival analysis is a statistical method for data in which the outcome of interest is a length of time $t$, running from a well-defined start time $t_0$ until the occurrence of some particular event or endpoint $t_c$, i.e. $t = t_c - t_0$ (Ata and Sozer, 2007). It is a common outcome measure in medical studies that relate treatment effects to the survival time of patients. In these cases, the typical start time is when the patient first received the treatment, and the endpoint is when the patient died or was lost to follow-up.

Developments in the field have led to several extensions of the original Cox proportional hazards (PH) model. However, the Cox PH model may not be appropriate in many situations, and modifications such as the stratified Cox model or the Cox model with time-dependent variables can be used for the analysis of survival data. The accelerated failure time (AFT) model is another alternative. It is therefore important to compare the performance of the Cox models and the AFT models. This study does so using a real dataset from a cohort of TB/HIV co-infected Nigerian adults managed in a tertiary Directly Observed Treatment Short Course (DOTS) centre for a period of six months.

Ata and Sozer (2007) considered the Cox regression model in the presence of non-proportional hazards and examined alternative models for use when the proportionality assumption is violated. They analysed treatment and prognosis effects with censored survival data, a setting in which the standard Cox model assumes a constant hazard ratio. David (2014) produced data for simulation experiments that mimic the types of data structures applied researchers encounter when using longitudinal biomedical data. Validity was assessed by a set of simulation experiments, and the results indicate that a non-proportional hazards model performs well in the face of a violated Cox proportional hazards assumption.
Jiezhi (2009) compared the proportional hazards (PH) model and parametric AFT models. The major aim of that work was to support the argument for considering the AFT model as an alternative to the PH model in the analysis of survival data, using real-life TB and HIV data from Uganda. Cox proportional hazards regression models have two advantages: the ability to incorporate time-varying covariate effects and the ability to incorporate time-varying covariates (Cox, 1972). Ogungbola et al. (2018) established that the AFT model provides a better description of their dataset because it allows prediction of the hazard function and the survival function as well as the time ratio. Their results revealed that the Weibull model provided a better fit to the studied data. Hence, it is better for researchers of TB/HIV co-infection to consider the AFT model even if the proportionality assumption is satisfied.

Kazeem et al. (2015) noted that the application of survival analysis has extended the importance of statistical methods for time-to-event data that incorporate time-dependent covariates. The Cox proportional hazards model is one such method that is widely used. They adopted an extension of the Cox model with time-dependent covariates for cases where the proportionality assumption is violated. The purpose of their study was to validate the model assumption when the hazard rate varies with time, and the approach was applied to data on duration of infertility subject to a time-varying covariate.

# Methodology

# a) Study and Sampling Procedure

The target population for this study comprises all patients with tuberculosis-related cases in the DOTS Clinic of NIMR who were registered between 2011 and 2016. The research design is cross-sectional. The study was carried out at the DOTS Clinic of the Nigerian Institute of Medical Research (NIMR), a parastatal under the Federal Ministry of Health that has treated over 5000 TB patients in the last 6 years.
The Institute has a Directly Observed Treatment Short Course (DOTS) centre where it attends to patients infected with TB. All patients enrolled between 2011 and 2016 were included in the study; this allowed the 6-month treatment cycle to be completed for those enrolled in 2016.

# Log rank test

The log-rank test was used to compare the death rate between two distinct groups, conditional on the number at risk in each group. The hypotheses are:

$H_0$: All survival curves are the same.

$H_1$: Not all survival curves are the same.

The log-rank statistic approximately follows a chi-squared distribution; it compares the observed number of failures with the number expected under the null hypothesis. A large chi-squared value leads to rejection of the null hypothesis in favour of the alternative.

# b) Cox Proportional Hazard Model

The non-parametric method does not control for covariates, and it requires categorical predictors. When there are several prognostic variables, multivariate approaches must be used. Multiple linear regression and logistic regression cannot be used because they cannot deal with censored observations, so another method is needed to model survival data in the presence of censoring. One very popular model for survival data is the Cox proportional hazards model, proposed by Cox (1972). It is given by

$$h(t \mid x) = h_0(t)\exp(\beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p) = h_0(t)\exp(\beta' x) \tag{1}$$

where $h_0(t)$ is the baseline hazard function, i.e. the hazard function for an individual for whom all the variables included in the model are zero, $x = (x_1, x_2, \ldots, x_p)'$ is the vector of explanatory variables for a particular individual, and $\beta' = (\beta_1, \beta_2, \ldots, \beta_p)$ is a vector of regression coefficients. The corresponding survival functions are related as follows:

$$S(t \mid x) = S_0(t)^{\exp\left(\sum_{i=1}^{p} \beta_i x_i\right)} \tag{2}$$

This model, also known as the Cox regression model, makes no assumptions about the form of $h_0(t)$ (the non-parametric part of the model) but assumes a parametric form for the effect of the predictors on the hazard (the parametric part of the model). The model is therefore referred to as semi-parametric. The beauty of the Cox approach is that this vagueness creates no problems for estimation. Even though the baseline hazard is not specified, we can still obtain good estimates of the regression coefficients $\beta$, the hazard ratio, and adjusted hazard curves.

The measure of effect is called the hazard ratio. The hazard ratio of two individuals with different covariates $x$ and $x^*$ is

$$HR = \frac{h_0(t)\exp(\beta' x)}{h_0(t)\exp(\beta' x^*)} = \exp\left[\beta'(x - x^*)\right] \tag{3}$$

This hazard ratio is time-independent, which is why the model is called the proportional hazards model.

# Limitation of Cox

# c) Accelerated Failure Time Model

The accelerated failure time (AFT) model is a parametric model that provides an alternative to the commonly used proportional hazards models. Whereas a proportional hazards model assumes that the effect of a covariate is to multiply the hazard by some constant, an AFT model assumes that the effect of a covariate is to accelerate or decelerate the life course of a disease by some constant. The assumption of the AFT model can be expressed as

$$S(t \mid x) = S_0\left(\exp(\theta' x)\, t\right) \quad \text{for } t \ge 0 \tag{4}$$

where $S(t \mid x)$ is the survival function at time $t$ for an individual with covariates $x$, and $S_0(\exp(\theta' x)\, t)$ is the baseline survival function evaluated at time $\exp(\theta' x)\, t$. Equation (4) states that the survival function of an individual with covariate $x$ at time $t$ is the same as the baseline survival function at time $\exp(\theta' x)\, t$. The factor $\exp(\theta' x)$ is known as the acceleration factor. The acceleration factor is the key measure of association obtained in the AFT model. It is a ratio of survival times corresponding to any fixed value of the survival probability $S(t)$.

The general log-linear representation of the AFT model for the $i$th individual is

$$\log T_i = \mu + \alpha_1 x_{1i} + \alpha_2 x_{2i} + \cdots + \alpha_p x_{pi} + \sigma \varepsilon_i \tag{5}$$

where $\log T_i$ is the log-transformed survival time, $x_1, \ldots, x_p$ are the explanatory variables with coefficients $\alpha_1, \ldots, \alpha_p$, $\varepsilon_i$ is the residual term, which is assumed to follow a specific distribution, $\mu$ is the intercept, and $\sigma$ is the scale parameter.

# Types of AFT Models

There are various types of AFT models:

1) Exponential and Weibull AFT model
2) Log-normal AFT model
3) Log-logistic AFT model
4) Gamma AFT model

We shall be explaining just the first two in this research.

i. Exponential and Weibull AFT model: The exponential distribution was first studied in connection with the kinetic theory of gases.

The survival function of the log-logistic AFT model is given by

$$S_i(t) = \left[1 + \exp\left(\frac{\log t - \mu - \alpha_1 x_{1i} - \cdots - \alpha_p x_{pi}}{\sigma}\right)\right]^{-1} \tag{9}$$

The cumulative hazard function of the log-logistic AFT model is given by

$$H_i(t) = -\log S_i(t) = \log\left[1 + \exp\left(\frac{\log t - \mu - \alpha_1 x_{1i} - \cdots - \alpha_p x_{pi}}{\sigma}\right)\right]$$

# Various goodness of fit tests

Various goodness-of-fit tests are available for assessing the fitted models.

# Analysis and Discussion

# Log rank test

$H_0$: The three regimens have no significant effect on TB preventive therapy for TB/HIV co-infected adults.

$H_1$: Not $H_0$.

In Table 1, the p-value (0.0192) is less than $\alpha = 0.05$, so we reject $H_0$: by the log-rank test, there is a significant difference among the three regimens of TB preventive therapy for TB/HIV co-infected adults at the 5% level of significance. The survival distributions therefore differ in the population. The Kaplan-Meier (K-M) curves for time to educate length and time to the combined event of the preventive therapy are presented in Figure 1.
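As a rough illustration of how the log-rank comparison works, the sketch below computes the two-group statistic in plain Python. It is a minimal sketch, not the software used in the study: the function name `logrank_two_groups` and the follow-up data are hypothetical, and only two groups are handled (the three-regimen comparison needs the multi-group version with a covariance matrix, which is not shown).

```python
import math


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-group log-rank test (illustrative helper, not from the paper).

    times_*  : follow-up time of each subject
    events_* : 1 if the event was observed, 0 if the subject was censored
    Returns (chi-squared statistic, p-value on 1 degree of freedom).
    """
    # Pool both groups as (time, event, group) records.
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    observed_a = expected_a = variance = 0.0
    for t in sorted({u for u, e, _ in data if e == 1}):
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk, group A
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)  # at risk, group B
        d_a = sum(1 for u, e, g in data if u == t and e == 1 and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n  # expected events in A under H0
        if n > 1:  # hypergeometric variance of d_a
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        raise ValueError("statistic undefined: no event time has both groups at risk")
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical follow-up times in months (1 = event, 0 = censored).
chi2, p = logrank_two_groups([2, 3, 5, 6], [1, 1, 0, 1], [4, 6, 6, 6], [1, 0, 0, 0])
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

A p-value below the chosen significance level (0.05 in this study) would lead to rejecting the hypothesis that the two survival curves are the same.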
# a) Cox Proportional Hazard Model

In Table 2, SEX, HAEMO GLUC, BMI and LYMPHABS have p-values below $\alpha = 0.05$ and are therefore statistically significant. The coefficient for Creatinine is positive, telling us that greater Creatinine values are associated with greater hazard and therefore shorter survival. The coefficient for weight is negative: normal body weight is associated with a lower hazard and longer survival among the therapy population. The coefficient of LYMPHABS is negative, indicating that the absence of a significant reduction in CD4 cells is associated with a lower hazard and longer survival. The CD4 cells are the cells that the HIV virus kills. As HIV infection progresses, the number of these cells declines. When the CD4 count drops below 200 due to advanced HIV disease, a person is diagnosed with AIDS. The normal range for CD4 lies between 500 and 1500 cells/mm³. If haemoglobin content is also reduced, the possibility of survival will be greatly affected. The parameter estimate for BMI is also negative, so higher BMI is associated with lower hazard and longer survival.

The results of a PH model fitted to this dataset are given in Table 3:

$$h_i(t) = h_0(t)\exp(0.328x_1 - 0.520x_2 - 0.004x_3 + 0.366x_4 - 0.001x_5 - 0.160x_6 + 0.002x_7 - 0.005x_8 - 0.679x_9)$$

where $x_1, \ldots, x_9$ denote the nine covariates of the fitted model (Table 3).

After a Cox PH model is fitted, the adequacy of the model, including the PH assumption and the goodness of fit, needs to be assessed. The PH assumption is checked with a graphical method and two statistical tests.

Omnibus Test: From Table 4, since the p-value (0.009) is less than 0.05, we have statistical reason to reject $H_0$ and conclude that the parameters of the model are stable and can be relied on in evidence-based decision making regarding TB/HIV preventive therapy. The log-likelihood also supports the significance of the model parameter estimates.
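The hazard-ratio reading of the Cox coefficients can be checked numerically with equation (3), $HR = \exp[\beta'(x - x^*)]$. The snippet below is a minimal sketch: the helper name `hazard_ratio` is hypothetical, and the two coefficients used (0.328 and -0.679) are the first and last terms of the fitted model above, applied to a one-unit difference in a single covariate with all others held equal.

```python
import math


def hazard_ratio(beta, x, x_star):
    """Hazard ratio exp(beta' (x - x*)) between two covariate vectors."""
    if not (len(beta) == len(x) == len(x_star)):
        raise ValueError("beta, x and x_star must have the same length")
    return math.exp(sum(b * (xi - xs) for b, xi, xs in zip(beta, x, x_star)))


# One-unit increase in a covariate with a positive coefficient: HR > 1,
# i.e. higher hazard and shorter survival.
hr_up = hazard_ratio([0.328], [1.0], [0.0])

# One-unit increase in a covariate with a negative coefficient: HR < 1,
# i.e. lower hazard and longer survival.
hr_down = hazard_ratio([-0.679], [1.0], [0.0])

print(f"HR (beta = 0.328)  = {hr_up:.3f}")
print(f"HR (beta = -0.679) = {hr_down:.3f}")
```

Because the baseline hazard $h_0(t)$ cancels in the ratio, the result does not depend on $t$; this is exactly the proportionality that the PH checks are meant to verify.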
# b) Accelerated Failure Time Models

If the PH assumption holds, the curves in the log-minus-log plot (see figure) will be parallel. For this reason, the accelerated failure time model is investigated. In univariate AFT models, age, haemoglobin, body mass index, sex, and absolute lymphocyte count are not statistically significantly associated with time to sputum conversion of TB/HIV co-infected patients. The results from the different AFT models applied to the time to sputum conversion are presented in Tables 5, 6, 7, and 8. The estimates do not differ greatly between models. The Weibull model in Table 8 shows that age and sex are statistically significant, while HAEMO GLUC, BMI and LYMPHABS are not significant, with p-values greater than 0.05.

The AFT models were compared using statistical criteria (likelihood ratio test and AIC). Nested AFT models can be compared using the likelihood ratio (LR) test in Table 10. The Cox model, log-logistic model and Weibull model are nested within the log-normal model (Table 10). According to the LR test, the Weibull model fits better. However, the LR test is not valid for comparing models that are not nested. In that case, we use AIC to compare the models (Table 11); the model with the smallest AIC is the best. The Weibull AFT model appears to be an appropriate AFT model according to AIC, although it is only slightly better than the log-logistic or log-normal model. We also note that the Cox model and the log-normal model are poorer fits according to the LR test and AIC. This provides more evidence that the PH assumption is not appropriate for these data. Finally, we conclude that the Weibull model is the best-fitting AFT model based on the AIC criterion.
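The AIC-based model selection described above can be sketched in a few lines. This is an illustration under stated assumptions: the study computes AIC as $-2\log L + 2(P + K)$, where $P$ is the number of distribution parameters and $K$ the number of coefficients; the function names and the log-likelihood values below are hypothetical and are not the values reported in the tables.

```python
def aic(log_likelihood, n_dist_params, n_coeffs):
    """AIC = -2 * log-likelihood + 2 * (P + K)."""
    return -2.0 * log_likelihood + 2.0 * (n_dist_params + n_coeffs)


def best_by_aic(models):
    """Return (name, AIC) of the model with the smallest AIC.

    models maps a model name to (log_likelihood, P, K).
    """
    scores = {name: aic(ll, p, k) for name, (ll, p, k) in models.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Illustrative values only (hypothetical log-likelihoods, six coefficients each).
candidates = {
    "Weibull": (-105.0, 2, 6),
    "Log-logistic": (-108.5, 2, 6),
    "Log-normal": (-110.2, 2, 6),
}

name, score = best_by_aic(candidates)
print(f"Smallest AIC: {name} ({score:.1f})")
```

Unlike the likelihood ratio test, this comparison does not require the candidate models to be nested, which is why it is used to rank the Cox and AFT fits side by side.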
# Conclusion

Our findings revealed no protective effect of TB/HIV preventive therapies on sputum conversion, death, or the combined event of conversion and death. The estimates of risk for the covariates, based on the baseline variables in the Cox proportional hazards model, are similar to those of the previous study. However, the PH assumption does not hold for LYMPHABS in this analysis. We also used three different AFT models to fit the data and found that the Weibull AFT model fits this dataset better.

In the univariate PH models, SEX, HAEMO GLUC, BMI and LYMPHABS have p-values below 0.05 and are statistically significant. The coefficient for Creatinine is positive, telling us that greater Creatinine values are associated with greater hazard and therefore shorter survival. The coefficient for weight is negative: normal body weight was associated with a lower hazard and longer survival among the therapy population. The coefficient of LYMPHABS is negative, indicating that the absence of a significant reduction in CD4 cells is associated with a lower hazard and longer survival. Men have longer survival time and sputum conversion time than women. The risks of TB/HIV progression, death, and the combined event of TB/HIV and death are higher among older adults.

The log-rank test showed that the effect of the three regimens has a significant association with TB/HIV co-infected preventive therapy. Moreover, through the Omnibus Tests of Model, we deduced that there is no significant difference in time to sputum conversion of the TB/HIV co-infected patients on therapy, which tells us that the model is statistically adequate and significant. According to the Cox PH model with time-dependent variables, the predictive effect of absolute lymphocyte count clearly changes at about 2 years. Before 2 years, the hazard ratio is less than one, which indicates that the risk of TB/HIV decreases as absolute lymphocyte count increases.
According to the log-logistic AFT model, LYMPHABS prolongs the time to sputum conversion as it increases. The PH model is routinely applied to the analysis of survival data. The study considered here provides an example of a situation where the AFT model is appropriate and where the PH model provides a less adequate description of the dataset. We have seen that the PH model is a less valuable and less realistic alternative to the AFT model in some situations. AIC shows that the Weibull AFT model fits better than the other models. This study is based on a large number of participants resident in Lagos, Nigeria, where the prevalence of TB infection and HIV is very high.

In this study, the Cox PH model and the AFT model have been compared using TB/HIV co-infection data. The association of the TB/HIV preventive therapies with sputum conversion is examined through the linkage of the signs and symptoms to replication of the virus. The Cox model expresses the multiplicative effect of covariates on the hazard. The AFT model provides an estimate of the survival time ratios. In this research, we have analysed the TB/HIV dataset using these alternative methods. Because the curves in the log-minus-log plot are not parallel, the Cox proportional hazards assumption does not hold in this dataset, and the PH model describes the data poorly. We select the model that best describes the data. In addition, the example illustrates that the AFT model has a more realistic interpretation and provides more informative results than the Cox PH model for the available data. Therefore:

a) We suggest that using the Cox PH model may not be the optimum approach. The AFT model may provide an alternative method to fit some survival data.

b) Determining the effect of the three regimens may be of additional value to researchers.
The results from this model could then be compared with the standard AFT models and Cox PH models. In addition, further study can be carried out to evaluate the effects of practical cases such as heavy censoring.

![Performance of Cox Proportional Hazards and Accelerated Failure Time Models in the Tuberculosis/HIV Co-Infected Survival Data](image-2.png "")

![PH Model: Cox regression model in the case of violation of the assumption of proportional hazards. It is improper to use a simple Cox regression model with regard to the violation of proportional hazard assumptions as it can lead to false deductions.](image-3.png "")

The Cox model with time-varying covariates remains a flexible model in survival analysis of patients with acute severe illness. Scheike (2004) presented some developments that deal with time-varying effects of covariates. He also emphasized the use of semi-parametric models where some effects are time-varying and some are time-constant, thus giving the extended flexibility only for effects where a simple description is not possible. Time-varying effects may be modelled completely non-parametrically by a general intensity model, $\lambda_i(t) = \lambda(t, X_i(t))$. Smoothing techniques have been suggested for estimation of $\lambda(\cdot)$; see, e.g., Nielsen and Linton (1995) and the references therein. Such a model may be useful when the number of covariates is small compared to the amount of data, but the generality of the model makes it difficult to get a clear conclusion, if any, about covariate effects.

Yuanxin (2013) built a Cox proportional hazards model by survival analysis using the SAS statistical package. In that analysis, the proportionality assumption or time dependence of individual factors is tested, variables are selected, and their interactions are considered to optimize the model. Because of the striking impact of gender on the prediction, the model is stratified by gender, so different baseline hazards are applied for the set of variables within each group.
In the model, the parameters are estimated by maximum likelihood using the Newton-Raphson algorithm. The results show that gender, diabetes status, age, body mass index, cholesterol and blood pressure affect disease onset and development. Interestingly, education level has an influence as well. In this research, we applied the model to the sputum conversion of TB/HIV co-infected patients managed in a tertiary DOTS centre for a period of 6 months among Nigerian adults. We also make use of knowledge of the percentage of censoring and of variation in sample sizes. All these contribute to the existing knowledge.

The various goodness-of-fit tests are:

1) Bayesian Information Criterion
2) Kolmogorov-Smirnov test
3) Cramer-von Mises criterion
4) Anderson-Darling test
5) Shapiro-Wilk test
6) Chi-squared test
7) Akaike Information Criterion
8) Hosmer-Lemeshow test
9) Kuiper's test
10) Kernelized Stein discrepancy
11) Zhang's $Z_K$, $Z_C$ and $Z_A$ tests
12) Moran test

AIC: The Akaike Information Criterion (AIC) is used to compare the various semi-parametric and parametric models. It is a measure of the goodness of fit of an estimated statistical model. In this study, AIC is computed as follows:

$$AIC = -2(\text{log-likelihood}) + 2(P + K) \tag{10}$$

where $P$ is the number of parameters and $K$ is the number of coefficients (excluding the constant) in the model. $P = 1$ for the exponential model and $P = 2$ for the Weibull and log-logistic models.

Table 6

| Covariate | Coeff | Life-Expn | Se(coeff) | Wald p |
|---|---|---|---|---|
| CD4 | -0.014 | 0.989 | 0.031 | 0.659 |
| Weight | -0.061 | 0.928 | 0.084 | 0.465 |
| BMI | 0.627 | 1.858 | 0.487 | 0.349 |
| Glucose | -0.023 | 0.977 | 0.016 | 0.852 |
| Haemoglobin | 0.146 | 1.158 | 0.161 | 0.009 |
| Creatine | -0.000 | 0.999 | 0.006 | 0.079 |

Table 7

| Covariate | Coeff | Life-Expn | Se(coeff) | Wald p |
|---|---|---|---|---|
| CD4 | -0.011 | 0.919 | 0.034 | 0.50 |
| Weight | -0.075 | 0.908 | 0.097 | 0.440 |
| BMI | 0.336 | 1.3959 | 0.376 | 0.371 |
| Glucose | -0.022 | 0.978 | 0.015 | 0.145 |
| Haemoglobin | 0.136 | 1.146 | 0.176 | 0.438 |
| Creatine | -0.00001 | 0.999 | 0.005 | 0.984 |

Table 8

| Distribution | m | L | LR | df |
|---|---|---|---|---|
| Cox model | 2 | -42.961 | 115.142 | 1 |
| Log-logistic | 2 | -100.532 | 326.460 | 1 |
| Weibull | 3 | -263.762 | 440.452 | 2 |
| Log-normal | 2 | -43.536 | | |

Table 9

| Distribution | Log-likelihood | k | c | AIC |
|---|---|---|---|---|
| Cox Model | | 6 | 1 | 256.214 |
| Log-logistic | -100.532 | 6 | 2 | 225.156 |
| Weibull | -263.762 | 6 | 1 | 218.079 |
| Log-normal | -43.536 | 6 | 2 | 235.019 |

© 2021 Global Journals

## Acknowledgement

We would like to acknowledge the Director and the Institutional Review Board (NIMR-IRB) of the Nigerian Institute of Medical Research, Yaba, Lagos, for approving the use of their patients' data.

## References

* Ata and Sozer, M. (2007). Cox Regression Models with Non proportional Hazards applied to Lung Cancer survival data. Hacettepe Journal of Mathematics and Statistics, 36(2).
* Ayman, A. M. (2012). Semi-Parametric Hazard Ratio applied to Engineering Insurance System. International Journal of Engineering Research and Application, 2(2).
* World Health Organization (2007). Global tuberculosis control: surveillance, planning, financing. 79.
* Bender, R., Augustin, T. and Blettner, M. (2005). Generating Survival Times to Simulate Cox PH Models. Wiley Online Library, 24(11), 338.
* Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society, Series B, 34.
* David, J. H. (2014). Data Generation for the Cox Proportional Hazards Model with Time-Dependent Covariates: A Method for Medical Researchers. Statistics in Medicine, 33(3).
* Efron, B. (1977). The efficiency of Cox's likelihood function for censored data. J. Am. Statist. Assoc., 72.
* Kazeem, A. A., Abiodun, A. A. and Ipinyomi, R. A. (2015). Semi-Parametric Non-Proportional Hazard Model With Time Varying Covariate. Journal of Modern Applied Statistical Methods, 14(2), Article 9.
* Jiezhi, Q. (2009). Comparison of Proportional Hazards and Accelerated Failure Time Models. Master of Science thesis, Department of Mathematics and Statistics, University of Saskatchewan, Saskatoon, Saskatchewan, Canada.
* Lindsay, S. (2004). Cox Regression Model. Thesis submitted to the Department of Mathematics, B.S., Virginia Polytechnic Institute and State University.
* Leemis, L. M., Shih, L. and Reynertson, K. (1989). Variate Generation for Accelerated Life and Proportional Hazards Models with Time Dependent Covariates. University of Oklahoma, Norman, OK 73019.
* Maller, R. and Zhou, X. (1996). Survival Analysis with Long-Term Survivors. Wiley, New York.
* Monica, M. B. (2011). Bayesian Approaches to Correcting Bias in Epidemiological Data. Dissertation submitted to the Department of Statistical Science, Baylor University.
* Nielsen, J. P. and Linton, O. B. (1995). Kernel Estimation in a Nonparametric Marker Dependent Hazard Model. Ann. Statist., 23, 7.
* Ogungbola, O. O., Akomolafe, A. A. and Musa, Z. A. (2018). Accelerated failure time models with application to data on TB/HIV co-infected patients in Nigeria. American J Epidemiol Public Health, 2(1).
* Pagano, M. and Gauvreau, K. (1993). Principles of Biostatistics, 1st ed. Wadsworth, Belmont, Calif.
* Persson, I. (2002). Essays on the assumption of Proportional Hazards in Cox Regression. Dissertation for the degree of Doctor of Philosophy in Statistics, Uppsala University.
* Jin, Z., Lin, D. Y. and Ying, Z. (2003). Rank-based Inference for Accelerated Failure Time Models. Biometrika, 90.
* John, L. M., Andrew, D. B., Patricia, J. S. and Tamara, Hbn. (2006). Modelling Survival in Acute Severe Illness: Cox versus AFT models. Journal of Evaluation in Clinical Practice, 1356-1294.
* Johnson, L. M. and Strawderman, R. L. (2009). Induced smoothing for the semiparametric accelerated failure time model: Asymptotics and extensions to clustered data. Biometrika, 96.
* Thomas, M. L. (1993). Bootstrap application in proportional hazard models. Retrospective theses and dissertations, Iowa State University.
* Wei, L. J., Lin, D. Y. and Weissfeld, L. (1989). Regression analysis of multivariate failure time data by modeling marginal distributions. Journal of the American Statistical Association, 84.
* Yuanxin, H. (2013). Survival Analysis of Cardiovascular Diseases. Washington University in St. Louis.
* Scheike, T. H. (2004). Time-varying effects in survival analysis. In Balakrishnan, N. and Rao, C. R. (eds.), Handbook of Statistics, 23. Elsevier B.V., North Holland.
* Sy Han, C. (2013). Statistical methods and computing for Semi-parametric and Accelerated Failure Time Model with induced Smoothing. Department of Statistics, University of Connecticut Graduate School.