*****Aleksandra Anić
**** 3rd December 2024

**# FIRST EXAMPLE MLOGIT
****example 1 from Stata mlogit — Multinomial (polytomous) logistic regression
use https://www.stata-press.com/data/r18/sysdsn1, clear
***The insurance is categorized as either an indemnity plan (that is, regular fee-for-service insurance,
***which may have a deductible or coinsurance rate) or a prepaid plan (a fixed up-front payment allowing
***subsequent unlimited use as provided, for instance, by an HMO)
label list insure // shows value labels: insure has 3 categories, 1 indemnity, 2 prepaid and 3 uninsure
tab insure // tabulate frequencies

*****Compare the relative frequencies of the insure variable with the fitted probabilities of a multinomial logit model with only a constant term for each category.
****mlogit without explanatory variables is the constant-only model
mlogit insure, nolog // nolog option suppresses the iteration log
predict pind pprep punin // no need to add the pr option: predicted probabilities are the default after mlogit
sum pind pprep punin
drop pind pprep punin

***estimate mlogit model and calculate probabilities of prepaid insurance for whites and nonwhites
mlogit insure nonwhite
predict pind pprep punin
sum pind pprep punin
tab insure
tab insure nonwhite, col
tab pind
tab pprep
tab punin
drop pind pprep punin

*****calculate probabilities by using formulas
***probability that a white person has prepaid
dis exp(-0.1879)/(1+exp(-0.1879)+exp(-1.9419))
***probability that a nonwhite person has prepaid
dis exp(-0.1879+0.6608)/(1+exp(-0.1879+0.6608)+exp(-1.9419+0.37796))

****this is for more advanced users, you can skip it
* note that /* at the beginning and */ at the end of lines are used to hide part of the code.
* To run the code, remove /* and */
/*
****calculate probabilities by using matrix algebra
mlogit insure nonwhite
mat coef=e(b)
mat list coef
dis coef[1,colnumb(coef,"Prepaid:nonwhite")]
dis coef[1,colnumb(coef,"Prepaid:_cons")]
***nonwhite=0
***probability of prepaid insurance for whites
gen pprep_white=exp(coef[1,colnumb(coef,"Prepaid:_cons")])/(1+exp(coef[1,colnumb(coef,"Uninsure:_cons")])+exp(coef[1,colnumb(coef,"Prepaid:_cons")]))
***nonwhite=1
***probability of prepaid insurance for nonwhites
gen pprep_nonwhite=exp(coef[1,colnumb(coef,"Prepaid:_cons")]+coef[1,colnumb(coef,"Prepaid:nonwhite")])/(1+exp(coef[1,colnumb(coef,"Uninsure:_cons")] + coef[1,colnumb(coef,"Uninsure:nonwhite")])+exp(coef[1,colnumb(coef,"Prepaid:_cons")]+coef[1,colnumb(coef,"Prepaid:nonwhite")]))
sum pprep*
*/

***change base category to prepaid and check probabilities
mlogit insure nonwhite, base(2)
predict pind pprep punin
sum pind pprep punin
tab insure
***CONCLUSION: for an mlogit with a constant, the sample average of predicted probabilities equals the observed frequencies
*** it does not matter which category is the base: the predicted probabilities are the same

**# Estimate marginal effects
***average marginal effect vs.
***marginal effect at the mean
mlogit insure nonwhite
margins, dydx(*) predict(outcome(2))
margins, dydx(*) predict(outcome(2)) atmeans

****marginal effects sum up to 0; the following code uses the stored matrix r(table) to check that the sum of marginal effects is 0
/*
mlogit insure nonwhite
margins, dydx(*) predict(outcome(1))
matrix list r(table)
scalar me1=r(table)[1,1]
margins, dydx(*) predict(outcome(2))
scalar me2=r(table)[1,1]
margins, dydx(*) predict(outcome(3))
scalar me3=r(table)[1,1]
dis %3.2f me1+me2+me3 // MEs sum up to 0
*/

***rrr option for mlogit displays relative-risk ratios (exponentiated coefficients); the choice of base category does not change the fitted model, only the category each ratio is relative to
mlogit insure nonwhite,rrr base(2)
****a relative-risk ratio for alternative A versus alternative B that is greater than 1 indicates that A becomes more likely relative to B; less than 1, that it becomes less likely
***nonwhites are less likely to choose indemnity compared with prepaid, and less likely to choose it compared with uninsure
mlogit insure nonwhite,rrr
***nonwhites are more likely to choose prepaid compared with indemnity, and uninsure compared with indemnity
***mlogit is used when all regressors are case-specific, i.e. age, male, nonwhite and site vary only across individuals

****check IIA
****explanation in Greene, THE INDEPENDENCE FROM IRRELEVANT ALTERNATIVES ASSUMPTION, chapter 18.2.4
*** est store NAME is the command that stores results. We store results from two mlogit models and compare them
****hausman test is used
****two options are added
**** alleqs: use all equations to perform test; default is first equation only
**** constant: include estimated intercepts in comparison; default is to exclude
****if a subset of the choice set truly is irrelevant, then omitting it from the model altogether will not change parameter estimates systematically.
****If we fail to reject the null hypothesis, there is no evidence against the IIA assumption
mlogit insure age male nonwhite i.site
est store m
mlogit insure age male nonwhite i.site if insure!=3
est store m3
mlogit insure age male nonwhite i.site if insure!=2
est store m2
mlogit insure age male nonwhite i.site if insure!=1
est store m1
***hausman expects the always-consistent (restricted) model first and the efficient (full) model second
hausman m3 m, alleqs constant
hausman m2 m, alleqs constant
hausman m1 m, alleqs constant

****Cameron & Trivedi, Microeconometrics Using Stata, ch 15.4
**# Choice of fishing mode
use D:\Microeconometrics_Master\Database\mus15data.dta, clear
cd D:\Microeconometrics_Master\Results
*we analyze data on individual choice of whether to fish using one of four possible modes: beach, pier, private and charter
describe
*** mode, price and crate: chosen fishing mode and the corresponding price and catch rate for that mode
****d variables are dummy variables
****p & q variables are alternative-specific variables, i.e. price and catch rate for each of the four possible fishing modes
***income is a case-specific variable
*data are in wide form
***one observation per individual
list * in 1 // one observation provides the data for all four alternatives for an individual
tab mode, sum(income)
***mlogit is used when we have case-specific explanatory variables and data in wide form
mlogit mode income, base(1) nolog
***outreg2 is a user-written command (ssc install outreg2)
outreg2 using mlogit_fish.out, lab dec(3) replace excel auto(3)
test income
margins, dydx(*) predict(outcome(1))
margins, dydx(*) predict(outcome(2))
margins, dydx(*) predict(outcome(3))
margins, dydx(*) predict(outcome(4))

generate id=_n
*convert to long form.
*For every individual we will have four observations, one for each of the four fishing modes
reshape long d p q, i(id) j(fishmode beach pier private charter) string
drop mode price crate // case-specific variables that are not needed
*****we have alternative-specific regressors for price and quality (catch rate)
*****case-specific regressor: income
***alternative-specific conditional logit model
asclogit d p q, case(id) alternatives(fishmode) casevars(income) basealternative(beach)
test p=q=0 // joint test of the alternative-specific regressors; if we fail to reject H0, CL reduces to MNL
***calculate Pseudo R-squared, not displayed in the output
***formula: Pseudo R2 = 1 - ll1/ll0, where ll1 and ll0 are e(ll) of the fitted and the intercept-only model (see ereturn list)
scalar ll1=e(ll)
***** intercept-only model, no case-specific or alternative-specific variables
asclogit d, case(id) alternatives(fishmode) basealternative(beach)
scalar ll0=e(ll)
scalar PseudoR2=1-ll1/ll0
*Alternatives summary for fishmode
estat alternatives
***predict uses the most recent estimates, here the intercept-only model
predict prob, pr
table fishmode, stat(mean d prob) nototal nformat(%5.4f)
***without alternative-specific regressors, the asclogit command gives the same estimates as mlogit
asclogit d, case(id) alternatives(fishmode) casevars(income) basealternative(beach)
outreg2 using mlogit_fish.out, lab dec(3) append excel auto(3)
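
***A minimal sketch, added for illustration and not part of the original exercise:
***the scalars ll1, ll0 and PseudoR2 are computed above but never shown, so this displays them.
***It assumes those three scalars are still in memory; the display formats are an arbitrary choice.
dis "Log likelihood, model with p, q and income: " %10.3f ll1
dis "Log likelihood, intercept-only model:       " %10.3f ll0
dis "Pseudo R2 = 1 - ll1/ll0:                    " %6.4f PseudoR2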