\[ \eta = g(\mu)\]
\[ \eta = \alpha + \sum_{j=1}^{p}\beta_j x_{ij}\]
\[ \mu = E(Y) = g^{-1}(\eta) \]
response | error distribution | link |
---|---|---|
continuous | gaussian | identity |
count | poisson | log |
proportion | binomial | logit |
binary | binomial | logit |
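The pairing of error distribution and link in the table can be inspected on R's family objects. This is a minimal sketch that assumes only base R (the `stats` package); the variable names are illustrative and do not come from the original material.

```r
# Default links of the families listed in the table
links <- c(
  gaussian = gaussian()$link,
  poisson  = poisson()$link,
  binomial = binomial()$link
)
print(links)

# Each family object carries the link g() and its inverse g^{-1}()
fam <- poisson()
mu  <- c(1, 10, 100)          # means on the response scale
eta <- fam$linkfun(mu)        # eta = g(mu), here log(mu)
mu_back <- fam$linkinv(eta)   # mu = g^{-1}(eta), here exp(eta)
```

Applying `linkinv` to the result of `linkfun` should recover the original means, which is the relation between the first and third equations above.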
\[ \eta = I(\hat{Y}) \] \[ \eta = \hat{Y} \]
\[ f(y) \] \[ f(\hat{Y} + \epsilon) \]
\[ f(\hat{Y}) \]
\[\log(\alpha + \sum\beta_i x_i )\]
row | pH | Biomass | Species |
---|---|---|---|
1 | high | 0.4692972 | 30 |
2 | high | 1.7308704 | 39 |
3 | high | 2.0897785 | 44 |
31 | mid | 0.1757627 | 29 |
32 | mid | 1.3767783 | 30 |
33 | mid | 2.5510426 | 21 |
61 | low | 0.1008479 | 18 |
62 | low | 0.1385961 | 19 |
63 | low | 0.8635151 | 15 |
glm01 <- glm(Species ~ Biomass + pH + Biomass:pH, family = poisson, data = arv)
anova(glm01, test = "Chisq")
## Analysis of Deviance Table
##
## Model: poisson, link: log
##
## Response: Species
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 89 452.35
## Biomass 1 44.673 88 407.67 2.328e-11 ***
## pH 2 308.431 86 99.24 < 2.2e-16 ***
## Biomass:pH 2 16.040 84 83.20 0.0003288 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
summary(glm01)
## Call:
## glm(formula = Species ~ Biomass + pH + Biomass:pH, family = poisson,
## data = arv)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.4978 -0.7485 -0.0402 0.5575 3.2297
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 3.76812 0.06153 61.240 < 2e-16 ***
## Biomass -0.10713 0.01249 -8.577 < 2e-16 ***
## pHlow -0.81557 0.10284 -7.931 2.18e-15 ***
## pHmid -0.33146 0.09217 -3.596 0.000323 ***
## Biomass:pHlow -0.15503 0.04003 -3.873 0.000108 ***
## Biomass:pHmid -0.03189 0.02308 -1.382 0.166954
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 452.346 on 89 degrees of freedom
## Residual deviance: 83.201 on 84 degrees of freedom
## AIC: 514.39
##
## Number of Fisher Scoring iterations: 4
glm02 <- glm(Species ~ Biomass + pH, family = poisson, data = arv)
anova(glm01, glm02, test = "Chisq")
## Analysis of Deviance Table
##
## Model 1: Species ~ Biomass + pH + Biomass:pH
## Model 2: Species ~ Biomass + pH
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 84 83.201
## 2 86 99.242 -2 -16.04 0.0003288 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
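The p-value in the comparison table can be reproduced by hand from the two residual deviances. The sketch below copies the numbers from the printed output and uses the chi-square approximation requested by `test = "Chisq"`; the variable names are illustrative.

```r
# Values copied from the two-model deviance table
dev_full    <- 83.201; df_full    <- 84   # Species ~ Biomass + pH + Biomass:pH
dev_reduced <- 99.242; df_reduced <- 86   # Species ~ Biomass + pH

delta_dev <- dev_reduced - dev_full       # deviance explained by the interaction
delta_df  <- df_reduced - df_full         # parameters removed from the full model

# Upper-tail chi-square probability of a deviance change this large
p_value <- pchisq(delta_dev, df = delta_df, lower.tail = FALSE)
print(c(delta_dev = delta_dev, delta_df = delta_df, p_value = p_value))
```

A small p-value indicates that dropping the `Biomass:pH` interaction significantly worsens the fit, so the interaction is retained.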
## (Intercept) Biomass
## 3.7681236 -0.1071298
\[ 3.77 - 0.107 \times 0.47 \]
## [1] 3.717773
\[ 3.77 - 0.107 \times 7.5 \]
## [1] 2.96465
\[ \exp(3.717) \]
## [1] 41.17258
\[ \exp(2.964) \]
## [1] 19.38792
\(\exp(\eta)\) or \(\exp(\hat{\alpha} + \hat{\beta} x)\)
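The printed results can be checked with the unrounded coefficients shown above. This sketch assumes biomass values of 0.47 and 7.5; the latter is the value that reproduces the printed 2.96465. Because `glm01` includes the `Biomass:pH` interaction, the intercept and `Biomass` coefficient describe the reference level (`pH` high). The variable names are illustrative.

```r
# Coefficients as printed for glm01 (reference level: pH high)
alpha <- 3.7681236
beta  <- -0.1071298

biomass <- c(0.47, 7.5)

# Linear predictor on the link (log) scale
eta <- alpha + beta * biomass

# Back-transform to the response scale: expected number of species
species_hat <- exp(eta)

print(eta)
print(species_hat)
```

The values in `eta` are on the log scale, so only `species_hat` can be read as an expected species count.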
fit the full model using the Poisson family with a log link, poisson(log)
assess overdispersion of the error with the ratio: residual deviance / residual degrees of freedom
if the ratio is much greater than 1, refit the full model with the quasipoisson family
if overdispersion persists, an alternative is to model the error with the negative binomial (one extra parameter, related to aggregation)
compare the simplified models with the most complex one using anova
retain the minimal adequate model
back-transform the model coefficients and predictions to the original scale (antilog)
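The steps above can be sketched as follows. Everything here is an assumption for illustration: the data are simulated, and the names `dat`, `full`, `quasi` and `reduced` are hypothetical. The negative binomial step is mentioned only in a comment, since it needs `glm.nb` from the MASS package.

```r
set.seed(42)

# Simulated count data (hypothetical, not the data set used in the text)
n <- 90
dat <- data.frame(
  biomass = runif(n, 0, 10),
  ph      = factor(rep(c("high", "mid", "low"), each = n / 3))
)
eta <- 3.5 - 0.1 * dat$biomass - 0.5 * (dat$ph == "low")
dat$species <- rpois(n, lambda = exp(eta))

# 1. Full model, Poisson errors with log link
full <- glm(species ~ biomass * ph, family = poisson(link = "log"), data = dat)

# 2. Overdispersion check: residual deviance / residual degrees of freedom
dispersion_ratio <- deviance(full) / df.residual(full)

# 3. If the ratio is much greater than 1, refit with quasipoisson
#    (a further alternative is MASS::glm.nb for negative binomial errors)
quasi <- glm(species ~ biomass * ph, family = quasipoisson, data = dat)
quasi_dispersion <- summary(quasi)$dispersion

# 4. Compare a simplified model with the full one
reduced <- update(full, . ~ . - biomass:ph)
cmp <- anova(reduced, full, test = "Chisq")

# 5. Back-transform coefficients to the original scale (antilog)
back <- exp(coef(reduced))

print(dispersion_ratio)
print(cmp)
print(back)
```

For quasi-likelihood fits, `test = "F"` is generally preferred over `"Chisq"` in `anova`, because the dispersion is estimated rather than fixed at 1.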
#(Dispersion parameter for poisson family taken to be 1)
# Null deviance: 452.346 on 89 degrees of freedom
# Residual deviance: 83.201 on 84 degrees of freedom
# AIC: 514.39
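Using the residual deviance and degrees of freedom printed above, the overdispersion ratio is a one-line calculation. The variable names are illustrative.

```r
# Values copied from the model summary
residual_deviance <- 83.201
residual_df       <- 84

# A ratio close to 1 is consistent with the Poisson assumption
dispersion_ratio <- residual_deviance / residual_df
print(dispersion_ratio)
```

The ratio comes out just below 1, so this fit gives no sign of overdispersion and the Poisson family can be kept.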
bounded in [0, 1]
variance depends on the mean
\(\sigma^2 = npq\)
n = number of trials
p = probability of success
q = probability of failure
q = 1 - p
\[ \log\left(\frac{p}{q}\right) = a + bx \]
\[ \eta = \log\left(\frac{p}{q}\right) \]
\[ \eta = \log\left(\frac{p}{1-p}\right) \]
\[ p = \frac{e^{a + bx}}{1 + e^{a + bx}} \]
\[ x \to \infty; \; p \to 1 \quad (b > 0) \]
\[ x \to -\infty; \; p \to 0 \quad (b > 0) \]
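The step from the logit to the probability can be written out explicitly. The derivation below is a sketch that takes \(\eta = a + bx\) as defined above and solves the logit for \(p\).

```latex
\begin{aligned}
\eta &= \log\left(\frac{p}{1-p}\right) \\
e^{\eta} &= \frac{p}{1-p} \\
p &= \frac{e^{\eta}}{1 + e^{\eta}} = \frac{1}{1 + e^{-\eta}}
\end{aligned}
```

Since \(e^{-\eta} \to 0\) as \(\eta \to \infty\) and \(e^{-\eta} \to \infty\) as \(\eta \to -\infty\), the predicted probability always stays between 0 and 1.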
flowered | number | dose | variety |
---|---|---|---|
0 | 12 | 1 | A |
0 | 17 | 4 | A |
0 | 17 | 1 | B |
3 | 15 | 4 | B |
2 | 14 | 1 | C |
1 | 15 | 4 | C |
2 | 18 | 1 | D |
3 | 19 | 4 | D |
success | failure |
---|---|
0 | 12 |
0 | 17 |
4 | 6 |
9 | 2 |
10 | 0 |
0 | 17 |
3 | 12 |
6 | 6 |
9 | 1 |
9 | 9 |
2 | 12 |
1 | 14 |
3 | 14 |
5 | 15 |
15 | 0 |
2 | 16 |
3 | 16 |
15 | 13 |
19 | 7 |
21 | 6 |
0 | 13 |
0 | 15 |
3 | 16 |
15 | 5 |
17 | 0 |
0 | 11 |
1 | 11 |
0 | 17 |
1 | 14 |
0 | 10 |
cbind(success, failure) ~ dose + variety + dose:variety
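In R, a two-column response of successes and failures is passed to `glm` as a matrix built with `cbind`. The sketch below uses only the eight sample rows shown earlier, so it is not the full data set. The data frame name `flowers` is hypothetical, while `yb` matches the response name in the printed output. To keep the tiny example well behaved, it fits `dose` alone rather than the full formula.

```r
# Sample rows shown earlier (flowered, number, dose, variety)
flowers <- data.frame(
  flowered = c(0, 0, 0, 3, 2, 1, 2, 3),
  number   = c(12, 17, 17, 15, 14, 15, 18, 19),
  dose     = c(1, 4, 1, 4, 1, 4, 1, 4),
  variety  = factor(c("A", "A", "B", "B", "C", "C", "D", "D"))
)

# Successes are plants that flowered; failures are the remainder
success <- flowers$flowered
failure <- flowers$number - flowers$flowered

# Two-column response: (successes, failures)
yb <- cbind(success, failure)

# Simplified model for illustration (dose only)
fit <- glm(yb ~ dose, family = binomial, data = flowers)
print(coef(fit))
```

With the complete data set, the same `yb` construction would be used with the full formula `yb ~ dose + variety + dose:variety`.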
## Analysis of Deviance Table
##
## Model: binomial, link: logit
##
## Response: yb
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 29 303.350
## dose 1 197.098 28 106.252 < 2.2e-16 ***
## variety 4 9.483 24 96.769 0.0501 .
## dose:variety 4 45.686 20 51.083 2.863e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Call:
## glm(formula = yb ~ dose + variety + dose:variety, family = binomial,
## data = flor)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.6648 -1.1200 -0.3769 0.5735 3.3299
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.59165 1.03215 -4.449 8.64e-06 ***
## dose 0.41262 0.10033 4.113 3.91e-05 ***
## varietyB 3.06197 1.09317 2.801 0.005094 **
## varietyC 1.23248 1.18812 1.037 0.299576
## varietyD 3.17506 1.07516 2.953 0.003146 **
## varietyE -0.71466 1.54849 -0.462 0.644426
## dose:varietyB -0.34282 0.10239 -3.348 0.000813 ***
## dose:varietyC -0.23039 0.10698 -2.154 0.031274 *
## dose:varietyD -0.30481 0.10257 -2.972 0.002961 **
## dose:varietyE -0.00649 0.13292 -0.049 0.961057
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 303.350 on 29 degrees of freedom
## Residual deviance: 51.083 on 20 degrees of freedom
## AIC: 123.55
##
## Number of Fisher Scoring iterations: 5
## Analysis of Deviance Table
##
## Model 1: yb ~ dose + variety + dose:variety
## Model 2: yb ~ dose + variety
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 20 51.083
## 2 24 96.769 -4 -45.686 2.863e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 1 2 3 4 5 6
## -4.1790352 -2.9411862 -1.2907208 2.0102098 8.6120711 -1.4598882
## 7 8 9 10 11 12
## -1.2504982 -0.9713115 -0.4129381 0.7038087 -3.1769379 -2.6302485
## 13 14 15 16 17 18
## -1.9013292 -0.4434907 2.4721863 -1.3087930 -0.9853868 -0.5541786
## 19 20 21 22 23 24
## 0.3082377 2.0330705 -4.9001833 -3.6818046 -2.0572996 1.1917103
## 25 26 27 28 29 30
## 7.6897302 -4.5916515 -1.5296848 -3.3591677 -1.4165950 -5.3063095
## 1 2 3 4 5 6
## 0.015082316 0.050154735 0.215730827 0.881864884 0.999818136 0.188484430
## 7 8 9 10 11 12
## 0.222613918 0.274619176 0.398207832 0.669031659 0.040042874 0.067216871
## 13 14 15 16 17 18
## 0.129958109 0.390909521 0.922168829 0.212688893 0.271824227 0.364895476
## 19 20 21 22 23 24
## 0.576455053 0.884225776 0.007390197 0.024559161 0.113316872 0.767046813
## 25 26 27 28 29 30
## 0.999542708 0.010034395 0.178039802 0.033596235 0.195195932 0.004935716
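The two blocks of output above hold the same predictions on two scales: first the linear predictor (logit scale), then the probability. The sketch below checks this for the first five observations by applying the inverse logit, `plogis`, to the printed link-scale values. The variable names are illustrative.

```r
# First five predictions on the link (logit) scale, copied from the output
eta <- c(-4.1790352, -2.9411862, -1.2907208, 2.0102098, 8.6120711)

# Corresponding printed probabilities
p_printed <- c(0.015082316, 0.050154735, 0.215730827, 0.881864884, 0.999818136)

# Inverse logit: p = exp(eta) / (1 + exp(eta))
p <- plogis(eta)

max_abs_diff <- max(abs(p - p_printed))
print(p)
print(max_abs_diff)
```

Any remaining difference comes only from the rounding of the printed values.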
## StudRes Hat CookD
## 9 3.6262074 0.1640033 0.2465669
## 10 -4.3712682 0.8793205 14.0234875
## 20 -3.0231527 0.6916136 2.1733865
## 24 -0.8177008 0.9513937 1.3097880
incidence | area | isolation |
---|---|---|
1 | 7.93 | 3.32 |
0 | 1.92 | 7.55 |
1 | 2.04 | 5.88 |
0 | 4.78 | 5.93 |
0 | 1.54 | 5.31 |
1 | 7.37 | 4.93 |
1 | 8.60 | 2.88 |
0 | 2.42 | 8.77 |
1 | 6.40 | 6.09 |
1 | 7.20 | 6.98 |
0 | 2.65 | 7.75 |
1 | 4.13 | 4.30 |
0 | 4.17 | 8.52 |
1 | 7.10 | 3.32 |
0 | 2.39 | 9.29 |
glmave <- glm(incidence ~ area + isolation + area:isolation,
              family = binomial, data = ave)
summary(glmave)
##
## Call:
## glm(formula = incidence ~ area + isolation + area:isolation,
## family = binomial, data = ave)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.84481 -0.33295 0.02027 0.34581 2.01591
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 4.0313 7.1747 0.562 0.574
## area 1.3807 2.1373 0.646 0.518
## isolation -0.9422 1.1689 -0.806 0.420
## area:isolation -0.1291 0.3389 -0.381 0.703
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 68.029 on 49 degrees of freedom
## Residual deviance: 28.252 on 46 degrees of freedom
## AIC: 36.252
##
## Number of Fisher Scoring iterations: 7
glmave01 <- glm(incidence ~ area + isolation,
                family = binomial, data = ave)
anova(glmave01, glmave, test="Chisq")
## Analysis of Deviance Table
##
## Model 1: incidence ~ area + isolation
## Model 2: incidence ~ area + isolation + area:isolation
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 47 28.402
## 2 46 28.252 1 0.15043 0.6981
summary(glmave01)
##
## Call:
## glm(formula = incidence ~ area + isolation, family = binomial,
## data = ave)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.8189 -0.3089 0.0490 0.3635 2.1192
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 6.6417 2.9218 2.273 0.02302 *
## area 0.5807 0.2478 2.344 0.01909 *
## isolation -1.3719 0.4769 -2.877 0.00401 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 68.029 on 49 degrees of freedom
## Residual deviance: 28.402 on 47 degrees of freedom
## AIC: 34.402
##
## Number of Fisher Scoring iterations: 6
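The coefficients of the additive model can be turned into odds ratios and predicted probabilities by hand. The sketch below copies the rounded estimates from the summary above and applies them to the first two rows of the data table. The helper `predict_p` and the other names are hypothetical.

```r
# Rounded estimates copied from the summary of incidence ~ area + isolation
b <- c(intercept = 6.6417, area = 0.5807, isolation = -1.3719)

# Odds ratios: multiplicative change in the odds per unit increase
odds_ratios <- exp(b[c("area", "isolation")])

# Predicted probability of incidence for given covariate values
predict_p <- function(area, isolation) {
  eta <- b[["intercept"]] + b[["area"]] * area + b[["isolation"]] * isolation
  plogis(eta)  # inverse logit
}

p_row1 <- predict_p(area = 7.93, isolation = 3.32)  # observed incidence: 1
p_row2 <- predict_p(area = 1.92, isolation = 7.55)  # observed incidence: 0

print(odds_ratios)
print(c(p_row1 = p_row1, p_row2 = p_row2))
```

An odds ratio above 1 for `area` and below 1 for `isolation` matches the signs of the estimates: larger areas raise the odds of incidence and greater isolation lowers them.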
glmave02 <- glm(incidence ~ area,
                family = binomial, data = ave)
anova(glmave02, glmave01, test="Chisq")
## Analysis of Deviance Table
##
## Model 1: incidence ~ area
## Model 2: incidence ~ area + isolation
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 48 50.172
## 2 47 28.402 1 21.77 3.073e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
glmave03 <- glm(incidence ~ isolation,
                family = binomial, data = ave)
anova(glmave03, glmave01, test= "Chisq")
## Analysis of Deviance Table
##
## Model 1: incidence ~ isolation
## Model 2: incidence ~ area + isolation
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 48 36.640
## 2 47 28.402 1 8.2375 0.004103 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1