Situation: Yes/no or categorical outcomes being compared across groups
Data Preparation
0-1 coding: Ensure that categorical outcome and exposure variables are coded as 0 = no, 1 = yes. While this is not required for chi-square tests, logistic regression, etc., it is a prerequisite for the epidemiological commands cc and cs, which report results as risk differences, odds ratios, risk ratios, etc.
Ensure that categorical groups are coded in increments of 1. For example, 0=illiterate, 1=primary school, 3=middle school is bad, while 0=illiterate, 1=primary school, 2=middle school is good. To check the coding, use
codebook
or
tab1 var, nolabel
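If the check shows a problem, recode is one way to fix it. Here is a minimal sketch on a small made-up dataset. The variable names smoker_raw and educ_raw and all the values are hypothetical, chosen only to show a 1/2 yes/no variable and an education variable with a gap in its codes.

```stata
* Toy data for illustration only (hypothetical names and values)
clear
input smoker_raw educ_raw
1 0
2 1
2 3
1 3
2 0
end

* smoker_raw is assumed to be coded 1 = yes, 2 = no; turn it into 0 = no, 1 = yes
recode smoker_raw (2 = 0) (1 = 1), generate(smoker)
label define yesno 0 "no" 1 "yes"
label values smoker yesno

* educ_raw skips the value 2; close the gap so the codes run 0, 1, 2
recode educ_raw (3 = 2), generate(educ)

* Confirm the new coding
tab1 smoker educ, nolabel
```

Using generate() keeps the original variables untouched, so you can cross-check old against new before dropping anything.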
Understand the data
A key step is to understand which participant groups have higher or lower levels of outcomes.
tab outcomeVar groupVar, col
Check whether the two groups have the same proportion of the outcome
prtest outcomeVar, by(groupVar)
Hypothesis testing using Chi Square
tab outcomeVar groupVar, col chi
Use Fisher's exact test if one or more cells have an expected count below 5
tab outcomeVar groupVar, col chi exact
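To see the expected counts for yourself, you can add the expected option to tabulate. The sketch below uses the immediate form tabi with made-up cell counts, so it runs without any dataset in memory. The numbers are purely illustrative.

```stata
* Hypothetical 2x2 table: rows are the outcome, columns are the groups
* The expected option prints the expected count under each observed count
tabi 3 7 \ 9 2, expected chi2 exact
```

With your own data, the equivalent would be tab outcomeVar groupVar, expected chi exact.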
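If you notice that the chi-square test and prtest give the same p-value, that is expected: for a 2x2 table the two tests are equivalent, since the chi-square statistic is the square of the z statistic. The advantage of the prtest command is that you also get the 95% CIs of the proportions.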
Comparing yes/no outcome across two groups only
Odds ratio calculation:
cc outcomeVar groupVar
or
logit outcomeVar i.groupVar, or
Risk ratio calculation:
cs outcomeVar groupVar
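To get a feel for what these commands report, you can feed made-up counts into their immediate forms, cci and csi. In the sketch below the four numbers are hypothetical: 20 exposed cases, 10 unexposed cases, 80 exposed non-cases and 90 unexposed non-cases.

```stata
* Argument order: exposed cases, unexposed cases, exposed non-cases, unexposed non-cases
* Risk in exposed   = 20/100 = 0.20
* Risk in unexposed = 10/100 = 0.10
* Risk difference = 0.10, risk ratio = 2
csi 20 10 80 90

* Odds ratio = (20*90)/(10*80) = 2.25
cci 20 10 80 90
```

Working the arithmetic by hand once like this makes it easier to read the cc and cs output on real data.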
Try these out!
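* Save the data currently in memory so it can be restored afterwards
preserve
* Load the example dataset
use https://www.stata-press.com/data/r17/pneumoniacrt, clear
* Inspect the variables and their coding
describe pneumonia vaccine
codebook pneumonia vaccine
count
* Cross-tabulate with column percentages, then add the chi-square test
tab pneumonia vaccine, col
tab pneumonia vaccine, col chi
* Two-sample test of proportions
prtest pneumonia, by(vaccine)
* Odds ratio (cc), then risk ratio and risk difference (cs)
cc pneumonia vaccine
cs pneumonia vaccine
* Odds ratio from logistic regression
logit pneumonia i.vaccine, or
* Bring back the original data
restore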
Comparing yes/no outcome across three or more groups
In this case, we can use Mantel-Haenszel techniques: tabodds and mhodds are your friends. Alternatively, you could just run a logistic regression.
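Here is a rough sketch of how these three options might look on a three-level group variable. The data are made up and entered as counts: the variable names outcome, educ and n and all the numbers are hypothetical, with n used as a frequency weight.

```stata
* Toy aggregated data: one row per education level and outcome value
clear
input educ outcome n
0 1 30
0 0 70
1 1 20
1 0 80
2 1 10
2 0 90
end

* Odds of the outcome in each education level, shown as odds ratios vs the lowest level
tabodds outcome educ [fweight = n], or

* Mantel-Haenszel odds ratio for a one-unit increase in educ
mhodds outcome educ [fweight = n]

* Logistic regression with educ as a categorical predictor
logit outcome i.educ [fweight = n], or
* Joint test that all education levels have the same odds
testparm i.educ
```

With individual-level data you would drop the [fweight = n] part and run the same commands directly on your outcome and group variables.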