Regression analysis of the factors affecting economic profitability of banana passion fruit production

Banana passion fruit production in Colombia contributes to the development of the country's economy, as it generates income and provides raw material for agro-based industries. Profitability is a key aspect of economic efficiency and plays an important role in farmers' decision-making; therefore, the purpose of the study was to determine the factors affecting the economic profitability of banana passion fruit production. A descriptive, quantitative, correlational and non-experimental design was selected and a regression analysis was performed. Results showed that the stationarity, normality, homoscedasticity and non-autocorrelation assumptions were not violated; all estimated coefficients were statistically significant and consistent with the hypothesized signs. As such, each independent variable individually helped to explain the economic profitability of banana passion fruit.


INTRODUCTION
Passion fruit is a tropical and subtropical species native to Central and South America, but it also occurs in Southeast Asia, Australia and the South Pacific islands. Its northernmost occurrence is the USA and its southernmost is Argentina, with Colombia and Brazil being the two countries where the greatest number of species is found (Parra and Cancino-Escalante, 2019).
It is difficult to quantify the passion fruit market precisely owing to the absence of reliable and continuous statistical data; however, it is estimated that global production averaged 1,468.8 million tons over the period 2015-2017 (Altendorf, 2017). In most Latin American countries, domestic consumption is high and production is mainly directed to local markets, although an average of 12,000 tons is exported to Europe and North America (Altendorf, 2017; Parra and Cancino-Escalante, 2019).
The Andean passion fruits, including the banana passion fruit, are highly valued in the commercial market not only for their edible fruits, but also for their nutritional and medicinal properties and for their flowers of ornamental value (Salazar and Ramírez, 2017). In Colombia, it is estimated that banana passion fruit represents one third of the country's passion fruit production, with a cultivated area of 1,382.5 hectares, a production of 15,554.2 tons and a yield of 11.25 tons per hectare in 2018. The majority of the producers are located in Boyacá and Norte de Santander (86.48%), with Cundinamarca, Huila and Nariño accounting for the remaining 13.52% (Agronet, 2020).
Banana passion fruit production contributes to the development of the country's economy, as it not only generates income for many farmers but also provides raw material for agro-based industries (Panin and Hlophe, 2013; Xaba and Masuku, 2013). Despite its economic importance, current research on banana passion fruit has focused mainly on agronomic aspects, largely to the exclusion of other important factors such as economic profitability, most probably because data on input prices, quantities produced and selling prices are lacking or inaccurate.
Nonetheless, profitability plays an important role in farmers' decision-making and is a key aspect of economic efficiency. As such, it is influenced by several factors, including yield, prices, farm size and location, production costs, level of output and variety of seed, among others (Bumbescu, 2015). It can be argued that profitability is, in fact, the primary objective of producers and that, regardless of the type of economic activity, it must be measured and evaluated. According to Faga and Ramos (2006), it is the surplus of total revenue (money generated from sales) over the cost of producing the good, whereas Geamănu (2011) defines it as the ability of a business to generate profit from its economic activity by allocating its resources efficiently. The concept is based on the neoclassical economic theory of the profit function, which assumes that firms maximize profit given input and output prices, subject to a given technology. The profit maximization problem, in which variable costs are deducted from gross revenues, can therefore be specified as in equation (1) (Lansink and Peerlings, 2001), where y and x correspond to the vectors of quantities of outputs and variable inputs, p and w refer to the corresponding vectors of prices, and z denotes the quantity of factors that are assumed to be fixed in the short term.
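One common way to write the profit function described above, using the same symbols, is sketched below. The technology set $T$ and the transpose notation are assumptions introduced for illustration and may differ from the notation used by Lansink and Peerlings (2001).

```latex
\pi(p, w, z) \;=\; \max_{y,\,x} \left\{\, p^{\top} y \;-\; w^{\top} x \;:\; (y, x, z) \in T \,\right\}
```

Here $p^{\top} y$ is gross revenue and $w^{\top} x$ is variable cost, so the maximand is revenue minus variable cost for any output and input bundle that is feasible given the fixed factors $z$.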
In view of the above, it is evident that profitability is closely connected to a firm's total revenue and production costs and, as such, it is an essential indicator of the firm's production decisions and its long-term survival (Geamănu, 2011). Despite its importance, however, the literature review showed that most studies on profitability place greater emphasis on financial aspects, such as net present value and gross margin analysis, and very few on economic factors. Nevertheless, Flórez and Miranda (2017) applied a linear regression model to a sample of ten producer sectors in the Peruvian jungle in order to determine the factors that influence the economic profitability of camu camu, based on the analysis of production cost, age, yield and seed density. The study found that the production cost parameter did not present the expected sign, and that the coefficient of determination (15.7%) and the correlation between the proposed variables and economic profitability were very low. In addition, the F-statistic was not significant at the 5% level; therefore, the authors concluded that the production of camu camu was not a viable option to improve farmers' income.
Likewise, Ramírez and Ávila (2013) analyzed the economic profitability of quinoa in the region of Puno, Peru, also employing the ordinary least squares method. The results of the study showed that the income, production cost and yield coefficients were statistically significant and jointly explained 97% of the variability in profitability. In addition, the assumptions of no serial correlation, homoscedasticity and normality were not violated. The authors also found that other variables, such as price, occupation and education, were not strong predictors of profitability.
Cancino, Cancino-Escalante and Quevedo-García (2018) also assessed the economic profitability of twenty-seven peach producers in the department of Norte de Santander, Colombia, using econometric tools. According to their findings, production costs, income, yield and tree age presented the expected signs and were statistically significant at the 5% level. Moreover, the study confirmed that the time series were stationary, and the Granger test showed a unidirectional causality running from tree age towards yield and income.
In view of the above, the aim of the present study was to determine the factors affecting the economic profitability of banana passion fruit production using multiple regression analysis. It should be emphasized, however, that this is the first empirical study on the economic profitability of banana passion fruit in Colombia.

MATERIALS AND METHODS
A descriptive, quantitative, correlational and non-experimental design was selected to describe relationships among variables. Secondary data was obtained from official and governmental institutions, namely the Colombian Ministry of Agriculture and Rural Development, the National Administrative Department of Statistics (DANE), FedePasifloras and the Information System of Prices and Supply of the Agricultural Sector (SIPSA).

DATA
Owing to data limitations, the sample period runs from 2007 to 2018. The variables used were banana passion fruit production (Q) in kilograms per hectare (kg/ha), domestic price (P) in Colombian pesos per kilogram (COP/kg) and production costs (PC) in Colombian pesos per hectare (COP/ha), which include rent on land, wages of hired labor, chemical and organic fertilizers, planting material and cost of equipment. In addition, total revenue (TR) in COP/ha was calculated as price multiplied by production, whilst profitability was calculated as the difference between total revenue and production costs divided by production costs, expressed as a percentage (%).
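The two derived variables can be illustrated with a short calculation. The sketch below follows the definitions given above; the function names and the input figures are hypothetical and are not taken from the study's data.

```python
def total_revenue(price_cop_per_kg, production_kg_per_ha):
    """Total revenue in COP/ha: price (COP/kg) times production (kg/ha)."""
    return price_cop_per_kg * production_kg_per_ha


def profitability_pct(price_cop_per_kg, production_kg_per_ha, cost_cop_per_ha):
    """Profitability in percent: (TR - PC) / PC * 100."""
    tr = total_revenue(price_cop_per_kg, production_kg_per_ha)
    return (tr - cost_cop_per_ha) / cost_cop_per_ha * 100.0


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
price = 1500.0        # COP/kg
production = 12000.0  # kg/ha
cost = 9000000.0      # COP/ha

print(total_revenue(price, production))            # revenue in COP/ha
print(profitability_pct(price, production, cost))  # profitability in %
```

With these illustrative figures, revenue is twice the production cost, so profitability equals 100%.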

Model Specification
To determine the factors that affect the profitability of banana passion fruit, a multiple regression equation was employed in order to examine the magnitude and direction of the effect of each independent variable. The empirical model is specified in equation (2):

$$\text{Profitability} = \beta_1 - \beta_2\,PC + \beta_3\,P + \beta_4\,Q + \beta_5\,TR + \mu \qquad (2)$$

where:
β1 = the constant term, representing the expected value of profitability when PC, P, Q and TR are zero.
β2 = the response of profitability to a one-unit change in production cost, all other variables held constant (ceteris paribus).
β3 = the change in profitability given a one-unit change in sales price, ceteris paribus.
β4 = the change in profitability given a one-unit change in production level, ceteris paribus.
β5 = the change in profitability given a one-unit change in total revenue, ceteris paribus.
µ = the error term, representing the effect of the variables omitted from the equation.
The sign of each coefficient indicates the direction of the relationship between the independent variable and the response variable. A positive sign is expected for P, Q and TR, as it indicates a direct relationship with profitability. PC should present a negative sign, as higher production costs for banana passion fruit reduce farmers' profitability.
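As a minimal sketch of how such a model can be estimated by ordinary least squares, the code below solves the normal equations in plain Python. The helper names and all data are hypothetical (rescaled units, not the study's series), and the coefficient on PC is estimated directly, so a negative estimate corresponds to the expected sign.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: regressors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Return ([intercept, b1, ..., bk], R-squared) for y regressed on rows."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(xi[a] * xi[b] for xi in X) for b in range(k)] for a in range(k)]
    xty = [sum(xi[a] * yi for xi, yi in zip(X, y)) for a in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, xi)) for xi in X]
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Hypothetical observations (NOT the study's data): PC in million COP/ha,
# P in thousand COP/kg, Q in t/ha, so TR = P * Q is in million COP/ha.
pc = [8.0, 9.0, 9.5, 10.0, 10.5, 11.0, 9.8, 10.2]
p = [0.9, 1.0, 1.2, 1.1, 1.5, 1.7, 1.3, 1.6]
q = [9.0, 10.0, 9.5, 11.0, 10.5, 11.5, 12.0, 8.9]
tr = [pi * qi for pi, qi in zip(p, q)]
profit = [(t - c) / c * 100.0 for t, c in zip(tr, pc)]

beta_hyp, r2_hyp = ols(list(zip(pc, p, q, tr)), profit)
print("coefficients (const, PC, P, Q, TR):", beta_hyp)
print("R-squared:", r2_hyp)
```

Solving the normal equations directly is adequate for a small, well-scaled illustration; a statistical package would normally be preferred because it also reports standard errors, t-statistics and the F-test.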

Stationarity test
A time series is said to be stationary if its statistical properties (mean, variance and covariance) do not change with time. The study therefore employed two approaches in order to identify which econometric model best fits the data. The first was the stationarity test proposed by Kwiatkowski, Phillips, Schmidt and Shin (1992) (KPSS), whose null hypothesis is that the series is stationary. The second consisted of the unit root tests of Dickey and Fuller (1979), in its augmented form (ADF), and of Phillips and Perron (1988) (PP), for which the null hypothesis is that the series has a unit root and is thus non-stationary.
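The logic of a unit root test can be sketched with the simplest Dickey-Fuller regression, in which the first difference of the series is regressed on a constant and the lagged level. The code below is a simplified illustration only: it omits the lagged differences that the augmented (ADF) version adds, the function name is hypothetical, and the series is invented.

```python
import math


def dickey_fuller_tstat(series):
    """t-statistic on gamma in: diff(y)_t = alpha + gamma * y_{t-1} + e_t."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(diffs)
    x_mean = sum(lagged) / n
    d_mean = sum(diffs) / n
    sxx = sum((x - x_mean) ** 2 for x in lagged)
    sxy = sum((x - x_mean) * (d - d_mean) for x, d in zip(lagged, diffs))
    gamma = sxy / sxx
    alpha = d_mean - gamma * x_mean
    rss = sum((d - alpha - gamma * x) ** 2 for x, d in zip(lagged, diffs))
    se_gamma = math.sqrt(rss / (n - 2) / sxx)
    return gamma / se_gamma


# Hypothetical annual series with 12 observations (not the study's data).
y = [10.2, 9.1, 10.8, 9.6, 10.4, 9.3, 10.9, 9.8, 10.1, 9.5, 10.6, 9.9]
print(dickey_fuller_tstat(y))
```

The resulting statistic has to be compared with Dickey-Fuller critical values rather than with Student's t tables; a value more negative than the critical value leads to rejection of the unit root null. The KPSS test reverses the hypotheses, so a small KPSS statistic is the outcome that supports stationarity.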

Regression Specification Error Test (RESET)
A common problem in regression models is functional misspecification and/or omitted variables, which can lead to biased and inefficient estimators (Babatunde, Ikughur, Ogunmola and Oguntunde, 2014). Therefore, the Regression Specification Error Test (RESET) proposed by Ramsey (1969) was used to test whether non-linear combinations of the explanatory variables help to explain the response variable. The test evaluates the null hypothesis of no misspecification against the alternative of misspecification.
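The mechanics of the test can be summarized as an auxiliary regression followed by an F-test. The sketch below uses generic notation that is assumed here for illustration: $k$ is the number of coefficients in the original model, $n$ the number of observations, $q$ the number of added powers of the fitted values $\hat{y}_t$, and $RSS_r$ and $RSS_u$ the residual sums of squares of the original and augmented regressions.

```latex
\begin{aligned}
\text{Original model:}\quad & y_t = x_t^{\top}\beta + u_t \\
\text{Augmented model:}\quad & y_t = x_t^{\top}\beta + \gamma_1 \hat{y}_t^{\,2} + \dots + \gamma_q \hat{y}_t^{\,q+1} + v_t \\
H_0:\quad & \gamma_1 = \dots = \gamma_q = 0 \\
F \;=\; & \frac{(RSS_r - RSS_u)/q}{RSS_u/(n - k - q)} \;\sim\; F_{q,\;n-k-q} \quad \text{under } H_0
\end{aligned}
```

A small F-statistic with a large p-value means that the added powers of the fitted values do not improve the fit, which is read as no evidence of misspecification.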

CUSUM and CUSUMSQ tests
The structural stability of the coefficients was analyzed using the cumulative sum (CUSUM) and the cumulative sum of squares (CUSUMSQ) tests, as they are useful diagnostic tools for parameter constancy (Ravinthirakumaran, Selvanathan, Selvanathan and Singh, 2015). Both tests are based on recursive residuals and have an advantage over other tests in that they do not require prior knowledge of the structural break point (Turner, 2010). Goldfeld and Quandt (1979) argue that, although the power of the tests can be low and erratic as well as sensitive to specification errors of the residuals, they are an important part of the estimation procedure and provide a valuable addition to the different evaluation techniques.
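For reference, the two statistics can be written in terms of recursive residuals. The notation below is assumed for illustration: $T$ is the sample size, $k$ the number of coefficients, $b_{t-1}$ the least squares estimate based on the first $t-1$ observations, $X_{t-1}$ the corresponding regressor matrix, and $\hat{\sigma}$ an estimate of the standard deviation of the recursive residuals.

```latex
\begin{aligned}
w_t &= \frac{y_t - x_t^{\top} b_{t-1}}{\sqrt{1 + x_t^{\top}\left(X_{t-1}^{\top} X_{t-1}\right)^{-1} x_t}}, \qquad t = k+1, \dots, T \\[4pt]
\text{CUSUM:}\quad W_t &= \frac{1}{\hat{\sigma}} \sum_{j=k+1}^{t} w_j \\[4pt]
\text{CUSUMSQ:}\quad S_t &= \frac{\sum_{j=k+1}^{t} w_j^{2}}{\sum_{j=k+1}^{T} w_j^{2}}
\end{aligned}
```

Under stable coefficients, $W_t$ fluctuates around zero and $S_t$ rises roughly along the line $(t-k)/(T-k)$; a path that crosses the significance bounds drawn around these reference lines signals instability.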

Normality, multicollinearity, heteroscedasticity and autocorrelation tests
The Jarque-Bera (1987) test and the variance inflation factor (VIF) were employed to assess the normality of the errors ($u \sim N(0, \sigma^2)$) and the presence of multicollinearity (when two or more independent variables in a multiple regression equation are correlated), respectively. Likewise, the Breusch-Pagan-Godfrey and Glejser tests were used to detect the presence of heteroscedasticity (when the variance of the error term in the regression model is not constant), whereas the Breusch-Godfrey LM and the Ljung-Box Q-statistic tests were used to examine the assumption of no autocorrelation (no identifiable relationship between the values of the error terms).
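As an illustration of the normality check, the Jarque-Bera statistic combines the sample skewness S and kurtosis K of the residuals as JB = n/6 · (S² + (K − 3)²/4) and is compared with a chi-square distribution with two degrees of freedom. The sketch below is a plain-Python version with a hypothetical function name and invented residuals; the p-value relies on the asymptotic distribution, which is only an approximation in small samples.

```python
import math


def jarque_bera(residuals):
    """Return the Jarque-Bera statistic and its asymptotic p-value."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments using the 1/n convention.
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Survival function of a chi-square with 2 degrees of freedom.
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# Hypothetical residuals (not the study's data).
resid = [0.4, -1.1, 0.7, 0.2, -0.5, 1.3, -0.9, 0.1, -0.3, 0.6, -0.8, 0.3]
print(jarque_bera(resid))
```

A p-value above the chosen significance level means that normality of the residuals is not rejected.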

Granger Causality Test
The study also employed the Granger (1969) causality test in order to assess the cause-effect relationship and the direction of causality between variables. The test is based on the concept that a variable X "Granger-causes" another variable Y if past values of X help to predict Y beyond what past values of Y alone can predict. Four outcomes are possible (Table 1).
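The four outcomes are commonly described as unidirectional causality from X to Y, unidirectional causality from Y to X, bidirectional causality and independence. The sketch below shows how a pair of test p-values maps onto these cases; the function name and the wording of the labels are assumptions for illustration.

```python
def granger_direction(p_x_to_y, p_y_to_x, alpha=0.05):
    """Classify a variable pair from the p-values of the two Granger tests.

    p_x_to_y: p-value for H0 "X does not Granger-cause Y".
    p_y_to_x: p-value for H0 "Y does not Granger-cause X".
    """
    x_causes_y = p_x_to_y < alpha
    y_causes_x = p_y_to_x < alpha
    if x_causes_y and y_causes_x:
        return "bidirectional causality"
    if x_causes_y:
        return "unidirectional causality from X to Y"
    if y_causes_x:
        return "unidirectional causality from Y to X"
    return "independence (no causality)"


# Hypothetical p-values, used only to exercise the four cases.
print(granger_direction(0.01, 0.40))
print(granger_direction(0.30, 0.02))
print(granger_direction(0.01, 0.03))
print(granger_direction(0.25, 0.60))
```

Each null hypothesis is rejected when its p-value falls below the chosen significance level, here 5% by default.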

Descriptive Statistics
Table 2 shows the per-hectare mean profitability, cost, price, production, total revenue and yield for banana passion fruit. The average production cost for the period analyzed (2007-2018) was estimated at $9,852,573 per hectare (this includes rent on land, wages of hired labor, chemical and organic fertilizers, planting material and cost of equipment). Profitability achieved by farmers ranged from a minimum of 97% to a maximum of 223% per hectare. In addition, banana passion fruit prices fluctuated from $934 to $1,761 per kilo and the average production was 16,020.5 (Kg/ha), with a minimum yield of 8.9 (Kg/ha) and a maximum of 11.6 (Kg/ha).

Stationarity Test
As a first step in determining the correct specification of the econometric model, it is important to examine the stationarity of the variables employed in order to avoid spurious regressions. Therefore, all variables were tested using the ADF and PP unit root tests as well as the KPSS stationarity test. The results in Table 3 indicate that, for the series in levels, the ADF and PP statistics were higher in absolute terms than the critical values at the 5% significance level, with the exception of production costs, which was significant at the 1% level under both the constant and the constant-plus-linear-trend specifications. For the KPSS test, all series were found to be stationary at the 1% level of significance. Taken as a whole, the results suggest that the variables are integrated of order zero, I(0), which implies that a least squares regression model can be employed.

Model Estimation
After establishing evidence that the series are integrated of order zero, I(0), the next step was to estimate the regression model (Table 4). Results showed that the t-statistics of all variables were statistically significant (p < 0.05) and that 98% of the variability observed in the profitability of banana passion fruit production can be explained by the joint effect of the independent variables. This indicates a good explanatory power of the model, which is further confirmed by the significance of the F-value (p = 0.000). The estimated equation (3) presented coefficients consistent with the hypothesized signs and of reasonable magnitudes. As expected, production cost was negatively related to profitability, whereas P, Q and TR presented a positive relationship.

Ramsey RESET
Testing the assumptions of the regression is also important to assess the goodness-of-fit of the data as well as particular aspects of the regression model. The Ramsey RESET test showed that the F-statistic of 0.14 was low, that its p-value of 0.9 was higher than the required level of significance (0.05), and that the coefficient of the squared fitted term was not statistically significant (t-stat = 0.195), thereby supporting a correct specification of the model.

Stability Test
Concerning the model's structural stability, Figure 1 plots the results of the CUSUM and CUSUMSQ tests, which indicate that in both charts the statistics remain within the 5% significance boundary lines, suggesting the absence of instability in the coefficients.

Figure 1. CUSUM and CUSUMSQ Plot
Source: Own elaboration

Normality, multicollinearity, heteroscedasticity and autocorrelation tests
The adequacy of the model and of the residuals was verified using different diagnostic tests. In terms of normality of the error term, the Jarque-Bera statistic indicated that the residuals from the regression are normally distributed (JB = 0.58; p-value = 0.74 > 0.05). As for multicollinearity, the results showed that no variable exceeded the VIF rule-of-thumb limit of 10, suggesting that there is no problem of collinearity (Table 5).
Another critical aspect of regression models is the estimation of linear equations in the presence of heteroscedasticity, which can lead to inefficient estimates. Therefore, the Breusch-Pagan-Godfrey (BPG) and Glejser tests were applied, and the results showed that the null hypothesis of no heteroscedasticity was not rejected by either test, even at a 20 percent level of significance (Table 5). Furthermore, when examining serial correlation, the plot of current versus lagged residuals shown in Figure 2 reveals that the dots are mainly concentrated in the third (southwest) quadrant, suggesting a positive correlation. The graphical method, however, is informal and subjective; therefore, alternative methods to identify serial correlation, namely the Breusch-Godfrey LM test, the Ljung-Box Q-statistic and the correlogram, were used. For the Breusch-Godfrey LM test with two lags, the probability of the chi-square statistic (p = 0.184) was not significant; thus, the null hypothesis of no serial correlation was not rejected. The Ljung-Box Q-statistic reinforces this result, as the p-values at all lags were also not significant. As for the correlogram, the spikes in the autocorrelations and partial autocorrelations did not exceed the confidence bounds (Figure 3).

Figure 3. Serial correlation diagnostic test
Source: Own elaboration
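The Ljung-Box Q-statistic used in the serial correlation check is computed from the residual autocorrelations as Q = n(n + 2) Σ ρ_k² / (n − k), summed over lags k = 1, ..., h. A minimal sketch follows; the function name and the residuals are hypothetical.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q-statistic for lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation of the residuals at lag k.
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q


# Hypothetical residuals (not the study's data).
resid_lb = [0.4, -1.1, 0.7, 0.2, -0.5, 1.3, -0.9, 0.1, -0.3, 0.6, -0.8, 0.3]
print(ljung_box_q(resid_lb, 2))
```

Under the null hypothesis of no autocorrelation, Q is compared with a chi-square distribution whose degrees of freedom equal the number of lags; a Q below the critical value is consistent with the absence of serial correlation.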

Granger Causality test
The study also evaluated the cause-effect relationship between variables. Based on the results of the Granger test with two lags, it can be concluded that for the majority of the variable pairs (9 out of 10) there is no causal relationship, as the p-values in both directions are higher than the established 5% significance level. Nevertheless, the test indicates a unidirectional causality running from PC to profitability; thus, it can be argued that PC contributes to the prediction of the economic profitability of banana passion fruit (Table 6).

DISCUSSION
The agricultural sector plays an important role in many least developed countries, especially those that are largely agrarian, as it contributes to economic growth. It is an important source of income for farmers, and its linkages with non-agricultural sectors promote job creation, consequently reducing poverty (Abdulai and Hazell, 1996). Therefore, in order to remain competitive and become more efficient, with every resource allocated optimally, farmers need to understand the nature and degree of the factors that affect crop profitability.
According to the proposed econometric model, the estimated coefficients of the explanatory variables were significantly different from zero at the 5% significance level and consistent with the hypothesized signs: as expected, production cost was negatively related to economic profitability, whereas P, Q and TR presented a positive relationship. These findings are in line with those of Flórez and Miranda (2017) and Cancino, Cancino-Escalante and Quevedo-García (2018), who identified that, for crops such as camu camu and peach, production costs also had a negative impact on profitability, suggesting that farmers should focus on cost reduction strategies through, for example, innovative practices.
Furthermore, the model passed all statistical diagnostic checks, with the ADF, PP and KPSS tests indicating stationarity. Likewise, the normality of the residuals and the stability of the coefficients were confirmed, implying that each of the proposed independent variables individually helps to explain banana passion fruit profitability and, as such, is an important predictor.

CONCLUSIONS
The present study contributes to understanding the factors that affect banana passion fruit profitability by means of a regression analysis. The research findings are of interest because they not only help to identify the variables that affect the economic profitability of banana passion fruit, but also provide valuable information for farmers on how to better allocate production resources. Nonetheless, it is recommended that future research include additional factors, such as opportunity and depreciation costs, as well as a larger data set, in order to provide a higher level of accuracy.