Memoire Online - Analyzing how to shift Informal Unit of Production (IUP) to formality: the case of Cameroon

2. Methodological issues

The methodological issues in this paper concern the concept of informality and the identification of dynamism among IUPs through discriminant and multivariate analyses.

The concept of informality

The debate over a universal definition of informality remains unsettled. The term "informal" was first used by Hart in 1971 and was taken up by the ILO in its 1972 report on Kenya. That report put forward seven criteria for identifying IUPs, including exclusive use of local resources, family ownership of the unit, small scale of activity, labour-intensive techniques, skills acquired outside formal training institutions, and highly competitive, unregulated markets. These characteristics were too numerous for a single unit to meet them all, so later definitions were restricted to two criteria: the scale and the legality of the unit. The scale criterion is the easiest to apply because it only requires a unit to have fewer than a threshold number of employees (usually 10). It is not well suited to international comparisons, however, and it misclassifies small but modern and often very profitable businesses such as law firms, notaries' offices and accounting practices. The legality criterion was introduced to address this shortcoming. Under it, an IUP is a unit that does not comply with the law, which leaves open the question of which of the many existing laws must be complied with. The ILO therefore combined the two criteria: small size in terms of employment and non-registration of the unit or of its regular workers. The 1-2-3 survey used in this paper treats as informal any activity that has no taxpayer identification number and/or does not keep written accounts in the format required by law.
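The classification rule of the 1-2-3 survey can be written as a one-line test. The sketch below is only an illustration: the function name, the field names and the sample units are hypothetical and do not come from the survey data.

```python
def is_informal(has_taxpayer_id: bool, keeps_legal_accounts: bool) -> bool:
    """Return True when a unit is informal under the rule quoted above.

    A unit is informal if it has no taxpayer identification number
    and/or does not keep written accounts in the legally required format.
    """
    return (not has_taxpayer_id) or (not keeps_legal_accounts)


# Hypothetical units, for illustration only.
units = [
    {"name": "tailor", "has_taxpayer_id": False, "keeps_legal_accounts": False},
    {"name": "garage", "has_taxpayer_id": True, "keeps_legal_accounts": False},
    {"name": "notary", "has_taxpayer_id": True, "keeps_legal_accounts": True},
]

informal_names = [
    u["name"]
    for u in units
    if is_informal(u["has_taxpayer_id"], u["keeps_legal_accounts"])
]
```

Note that a unit with a taxpayer number but no compliant accounts still counts as informal, which is what the "and/or" in the rule implies.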

Measuring the economic dynamism of IUPs

Among candidate variables such as sales or number of employees, profit was chosen as the variable that discriminates between the less and the more effective IUPs. The less effective group consists of IUPs whose monthly profit is below the nationwide median; the more effective group consists of those whose monthly profit is above it. Profit is defined as the difference between sales and costs (mainly salaries and taxes). Once the discrimination criterion was set, the next task was to extract from the large database the variables most likely to explain membership of one group or the other. Principal component analysis (PCA) was used for this variable selection. PCA, like factor analysis, is a statistical tool that summarizes the variability in a large set of variables. It describes the variation of a given set of variables through linear combinations of the original variables, each of which explains as much of the remaining variation as possible while being uncorrelated with the other linear combinations. Analysts usually focus on the first two linear combinations, which by construction explain more of the variability than any of the others. The IUPs can then be displayed in a scatter plot on the two axes defined by these first two linear combinations, and the variables can be represented in the correlation circle drawn on the same axes.
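As an illustration of the PCA step, here is a minimal sketch that extracts the first two principal axes from standardized variables using power iteration with deflation. It is not the procedure actually run on the survey database: the choice to standardize, the function names and the three-variable sample (sales, employees, capital) are assumptions made for the example.

```python
import math


def standardize(rows):
    """Center each column and scale it to unit sample variance.

    Assumes no column is constant (a constant column has zero variance).
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]


def correlation_matrix(z):
    """Correlation matrix of already standardized rows."""
    n, p = len(z), len(z[0])
    return [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(p)]
            for a in range(p)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def dominant_eigenpair(m, iters=1000):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix."""
    p = len(m)
    v = [1.0 + 0.1 * j for j in range(p)]  # arbitrary non-symmetric start
    norm = math.sqrt(dot(v, v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [dot(row, v) for row in m]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:
            break
        v = [x / norm for x in w]
    return dot(v, [dot(row, v) for row in m]), v


def first_two_components(rows):
    """First two principal axes, their variance shares, and unit coordinates."""
    z = standardize(rows)
    corr = correlation_matrix(z)
    p = len(corr)
    l1, v1 = dominant_eigenpair(corr)
    # Deflation: remove the first axis so the next dominant axis is the second.
    deflated = [[corr[a][b] - l1 * v1[a] * v1[b] for b in range(p)]
                for a in range(p)]
    l2, v2 = dominant_eigenpair(deflated)
    total = sum(corr[a][a] for a in range(p))
    coords = [(dot(r, v1), dot(r, v2)) for r in z]
    return {"axes": (v1, v2), "explained": (l1 / total, l2 / total),
            "coords": coords}


# Hypothetical sample: one row per IUP (sales, employees, capital).
sample = [[2, 1, 1], [4, 2, 1.5], [6, 3, 3.5],
          [8, 4, 3], [10, 5, 5.5], [12, 6, 5]]
result = first_two_components(sample)
```

The `coords` pairs are what would be drawn in the scatter plot of IUPs, and the two `axes` are uncorrelated by construction, as the text requires.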

The next step was to apply multivariate discriminant analysis techniques to differentiate the two groups of IUPs, so that an unclassified IUP could be assigned to the appropriate group from a few core characteristics alone. For this purpose we used both the so-called credit scoring technique and logistic regression. Credit scoring is used in several fields, such as medicine, meteorology and finance, where it serves to identify solvent clients. It consists of performing comparison tests on the core variables identified through the PCA, using Wilks' Lambda (Λ) as the test statistic. Its applicability rests on two hypotheses: equality of the covariance matrices of the two groups and normality of the distribution within each group. If Λ is close to 1 for a variable, that variable contributes little to the differentiation; conversely, the further Λ falls below 1, the more the variable contributes. Because credit scoring depends on these strong hypotheses, it is easier to sidestep them by applying a logistic regression. The Logit function is defined as

$$\operatorname{Logit}(P) = \sum_{i=1}^{p} \beta_i X_i$$

where $\beta_i$ designates the coefficients, $i$ the index of the variable, $X_i$ the variable, $p$ the number of variables and $P$ the probability of being classified in the effective group. This equality corresponds to the expression

$$P(Y = 1 \mid X = x) = \frac{1}{1 + e^{-(\beta_1 x_1 + \dots + \beta_p x_p)}}$$
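To make the reading of Wilks' Lambda in the credit scoring step concrete, the sketch below computes it for a single variable and two groups, where it reduces to the within-group sum of squares divided by the total sum of squares (the multivariate version replaces these sums with determinants of the corresponding matrices). The function name and the sample values are hypothetical.

```python
def wilks_lambda(group_a, group_b):
    """Univariate Wilks' Lambda for two groups.

    Lambda = within-group sum of squares / total sum of squares.
    Values near 1 mean the variable does not separate the groups;
    values near 0 mean it separates them strongly.
    Assumes the pooled values are not all identical (total SS > 0).
    """
    def sum_of_squares(values, center):
        return sum((v - center) ** 2 for v in values)

    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    grand_mean = sum(pooled) / len(pooled)

    within = sum_of_squares(group_a, mean_a) + sum_of_squares(group_b, mean_b)
    total = sum_of_squares(pooled, grand_mean)
    return within / total


# Hypothetical values of one variable in the less and more effective groups.
well_separated = wilks_lambda([1, 2, 3], [7, 8, 9])   # 4 / 58, close to 0
not_separated = wilks_lambda([1, 2, 3], [1, 2, 3])    # equal means give 1
```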

The coefficients are estimated by maximum likelihood. Unlike the credit scoring technique, logistic regression does not require the variables to be normally distributed. An IUP was classified in the effective group if its estimated probability exceeded 0.5. From this process we derived the effectiveness score, defined as $S(x) = \beta_1 x_1 + \dots + \beta_p x_p$, and then ranked the IUPs by their score.
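The whole chain (median split on profit, maximum likelihood estimation, 0.5 threshold, score and ranking) can be sketched in a few lines. This is an illustration under stated assumptions, not the estimation actually performed: the likelihood is maximized by plain gradient ascent, the model has no intercept because the formula in the text has none (so the two explanatory variables are assumed to be centered), and the profits, variables, learning rate and iteration count are all hypothetical.

```python
import math
from statistics import median


def sigmoid(s):
    """Logistic function 1 / (1 + exp(-s)), written to avoid overflow."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def score(beta, x):
    """Effectiveness score S(x) = beta_1 x_1 + ... + beta_p x_p."""
    return sum(b * xi for b, xi in zip(beta, x))


def log_likelihood(beta, X, y):
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(score(beta, xi))
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


def fit_logit(X, y, learning_rate=0.1, iters=2000):
    """Maximum likelihood by gradient ascent on the mean log-likelihood."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        for xi, yi in zip(X, y):
            residual = yi - sigmoid(score(beta, xi))
            for j in range(p):
                grad[j] += residual * xi[j]
        beta = [b + learning_rate * g / n for b, g in zip(beta, grad)]
    return beta


# Hypothetical monthly profits and two centered explanatory variables.
profits = [10, 20, 30, 40, 50, 60, 70, 80]
X = [[-2.0, 0.5], [-1.0, -0.5], [-1.5, 0.2], [1.0, -0.3],
     [-1.0, 0.1], [1.5, -0.2], [1.0, 0.4], [2.0, -0.2]]

# Median split: 1 = more effective group, 0 = less effective group.
cutoff = median(profits)
y = [1 if pr > cutoff else 0 for pr in profits]

beta = fit_logit(X, y)
scores = [score(beta, xi) for xi in X]
probabilities = [sigmoid(s) for s in scores]
predicted = [1 if p > 0.5 else 0 for p in probabilities]

# Rank the units from the highest to the lowest effectiveness score.
ranking = sorted(range(len(X)), key=lambda i: scores[i], reverse=True)
```

Because there is no intercept, a probability above 0.5 is the same condition as a positive score, so the threshold rule and the ranking by score are consistent with each other.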

We have deliberately chosen to report only the results of the logistic regression, because they proved more reliable than those of the credit scoring method: the confusion matrix of the logistic regression showed better classification performance than that of credit scoring.
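A comparison of two confusion matrices can be sketched as follows. The labels and the two sets of predictions are invented for the example and are not the results of this study; the helper names are also hypothetical, and accuracy is used here as one possible summary of a confusion matrix.

```python
def confusion_matrix(actual, predicted):
    """Count the four outcomes for binary labels (1 = effective group)."""
    pairs = list(zip(actual, predicted))
    return {
        "true_pos": sum(1 for a, p in pairs if a == 1 and p == 1),
        "false_neg": sum(1 for a, p in pairs if a == 1 and p == 0),
        "false_pos": sum(1 for a, p in pairs if a == 0 and p == 1),
        "true_neg": sum(1 for a, p in pairs if a == 0 and p == 0),
    }


def accuracy(matrix):
    """Share of units classified in their actual group."""
    correct = matrix["true_pos"] + matrix["true_neg"]
    return correct / sum(matrix.values())


# Made-up labels and predictions from two hypothetical classifiers.
actual = [1, 1, 1, 0, 0, 0]
predicted_a = [1, 1, 0, 0, 0, 0]
predicted_b = [1, 0, 0, 0, 1, 0]

matrix_a = confusion_matrix(actual, predicted_a)
matrix_b = confusion_matrix(actual, predicted_b)
better = "a" if accuracy(matrix_a) > accuracy(matrix_b) else "b"
```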