gev                   package:VGAM                   R Documentation

_G_e_n_e_r_a_l_i_z_e_d _E_x_t_r_e_m_e _V_a_l_u_e _D_i_s_t_r_i_b_u_t_i_o_n _F_a_m_i_l_y _F_u_n_c_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Maximum likelihood estimation of the 3-parameter generalized
     extreme value (GEV) distribution.

_U_s_a_g_e:

     gev(llocation = "identity", lscale = "loge", lshape = "logoff",
         elocation = list(), escale = list(),
         eshape = if(lshape=="logoff") list(offset=0.5) else
         if(lshape=="elogit") list(min=-0.5, max=0.5) else list(),
         percentiles = c(95, 99),
         iscale=NULL, ishape = NULL,
         method.init = 1, gshape=c(-0.45, 0.45), tshape0=0.001, zero = 3)
     egev(llocation = "identity", lscale = "loge", lshape = "logoff",
          elocation = list(), escale = list(),
          eshape = if(lshape=="logoff") list(offset=0.5) else
          if(lshape=="elogit") list(min=-0.5, max=0.5) else list(),
          percentiles = c(95, 99),
          iscale=NULL,  ishape = NULL,
          method.init=1, gshape=c(-0.45, 0.45), tshape0=0.001, zero = 3)

_A_r_g_u_m_e_n_t_s:

llocation, lscale, lshape: Parameter link functions for mu, sigma and
          xi respectively. See 'Links' for more choices.

elocation, escale, eshape: List. Extra argument for the respective
          links. See 'earg' in 'Links' for general information. For the
          shape parameter, if the 'logoff' link is chosen then the
          offset is called A below; and then the linear/additive
          predictor is log(xi+A) which means that xi > -A. For
          technical reasons (see *Details*) it is a good idea for
          A=0.5.

percentiles: Numeric vector of percentiles used for the fitted values.
          Values should be between 0 and 100. However, if
          'percentiles=NULL', then the mean mu + sigma *
          (gamma(1-xi)-1)/xi is returned, and this is only defined if
          xi<1.

iscale, ishape: Numeric. Initial value for sigma and xi. A 'NULL' means
          a value is computed internally. The argument 'ishape' is more
          important than the other two because they are initialized
          from the initial xi. If a failure to converge occurs, or even
          to obtain initial values occurs, try assigning 'ishape' some
          value (positive or negative; the sign can be very important).
          Also, in general, a larger value of 'iscale' is better than a
          smaller value.

method.init: Initialization method. Either the value 1 or 2. Method 1
          involves choosing the best xi on a course grid with 
          endpoints 'gshape'. Method 2 is similar to the method of
          moments. If both methods fail try using 'ishape'.

  gshape: Numeric, of length 2. Range of xi used for a grid search for
          a good initial value for xi. Used only if 'method.init'
          equals 1.

 tshape0: Positive numeric. Threshold/tolerance value for resting
          whether xi is zero. If the absolute value of the estimate of
          xi is less than this value then it will be assumed zero and
          Gumbel derivatives etc. will be used.

    zero: An integer-valued vector specifying which linear/additive
          predictors are modelled as intercepts only. The values must
          be from the set {1,2,3} corresponding respectively to mu,
          sigma, xi. If 'zero=NULL' then all linear/additive predictors
          are modelled as a linear combination of the explanatory
          variables. For many data sets having 'zero=3' is a good idea.

_D_e_t_a_i_l_s:

     The GEV distribution function can be written

          G(y) =  exp( -[ (y- mu)/  sigma  ]_{+}^{- 1/  xi})

     where sigma > 0, -Inf < mu < Inf, and 1 + xi*(y-mu)/sigma > 0.
     Here, x_+ = max(x,0). The mu, sigma, xi are known as the
     _location_, _scale_ and _shape_ parameters respectively. The cases
      xi>0, xi<0, xi=0 correspond to the Frechet, Weibull, and Gumbel
     types respectively. It can be noted that the Gumbel (or Type I)
     distribution accommodates many commonly-used distributions such as
     the normal, lognormal, logistic, gamma, exponential and Weibull.

     For the GEV distribution, the kth moment about the mean exists if
     xi < 1/k. Provided they exist, the mean and variance are given by
     mu + sigma { Gamma(1-xi)-1} / xi and sigma^2 { Gamma(1-2 xi) -
     Gamma^2 (1- xi) } / xi^2 respectively, where Gamma is the gamma
     function.

     Smith (1985) established that when xi > -0.5, the maximum
     likelihood estimators are completely regular. To have some control
     over the estimated xi try using 'lshape="logoff"' and the
     'eshape=list(offset=0.5)', say, or 'lshape="elogit"' and
     'eshape=list(min=-0.5, max=0.5)', say.

_V_a_l_u_e:

     An object of class '"vglmff"' (see 'vglmff-class'). The object is
     used by modelling functions such as 'vglm', and 'vgam'.

_W_a_r_n_i_n_g:

     Currently, if an estimate of xi is too close to zero then an error
     will occur for 'gev()' with multivariate responses. In general,
     'egev()' is more reliable than 'gev()'.

     Fitting the GEV by maximum likelihood estimation can be
     numerically fraught. If 1 + xi*(y-mu)/sigma <= 0 then some crude
     evasive action is taken but the estimation process can still fail.
     This is particularly the case if 'vgam' with 's' is used; then
     smoothing is best done with 'vglm' with regression splines ('bs'
     or 'ns') because 'vglm' implements half-stepsizing whereas 'vgam'
     doesn't (half-stepsizing helps handle the problem of straying
     outside the parameter space.)

_N_o_t_e:

     The 'VGAM' family function 'gev' can handle a multivariate
     (matrix) response. If so, each row of the matrix is sorted into
     descending order and 'NA's are put last. With a vector or
     one-column matrix response using 'egev' will give the same result
     but be faster and it handles the xi=0 case. The function 'gev'
     implements Tawn (1988) while  'egev' implements Prescott and
     Walden (1980).

     The shape parameter xi is difficult to estimate accurately unless
     there is a lot of data. Convergence is slow when xi is near -0.5.
     Given many explanatory variables, it is often a good idea to make
     sure 'zero=3'.  The range restrictions of the parameter xi are not
     enforced; thus it is possible for a violation to occur.

     Successful convergence often depends on having a reasonably good
     initial value for xi. If failure occurs try various values for the
     argument 'ishape', and if there are covariates,  having 'zero=3'
     is advised.

_A_u_t_h_o_r(_s):

     T. W. Yee

_R_e_f_e_r_e_n_c_e_s:

     Yee, T. W. and Stephenson, A. G. (2007) Vector generalized linear
     and additive extreme value models. _Extremes_, *10*, 1-19.

     Tawn, J. A. (1988) An extreme-value theory model for dependent
     observations. _Journal of Hydrology_, *101*, 227-250.

     Prescott, P. and Walden, A. T. (1980) Maximum likelihood
     estimation of the parameters of the generalized extreme-value
     distribution. _Biometrika_, *67*, 723-724.

     Smith, R. L. (1985) Maximum likelihood estimation in a class of
     nonregular cases. _Biometrika_, *72*, 67-90.

_S_e_e _A_l_s_o:

     'rgev', 'gumbel', 'egumbel', 'guplot', 'rlplot.egev', 'gpd',
     'elogit', 'oxtemp', 'venice'.

_E_x_a_m_p_l_e_s:

     # Multivariate example
     data(venice)
     y = as.matrix(venice[,paste("r", 1:10, sep="")])
     fit1 = vgam(y[,1:2] ~ s(year, df=3), gev(zero=2:3), venice, trace=TRUE)
     coef(fit1, matrix=TRUE)
     fitted(fit1)[1:4,]
     ## Not run: 
     par(mfrow=c(1,2), las=1)
     plot(fit1, se=TRUE, lcol="blue", scol="forestgreen",
          main="Fitted mu(year) function (centered)")
     attach(venice)
     matplot(year, y[,1:2], ylab="Sea level (cm)", col=1:2,
             main="Highest 2 annual sealevels and fitted 95 percentile")
     lines(year, fitted(fit1)[,1], lty="dashed", col="blue")
     detach(venice)
     ## End(Not run)

     # Univariate example
     data(oxtemp)
     (fit = vglm(maxtemp ~ 1, egev, data=oxtemp, trace=TRUE))
     fitted(fit)[1:3,]
     coef(fit, mat=TRUE)
     Coef(fit)
     vcov(fit)
     vcov(fit, untransform=TRUE)
     sqrt(diag(vcov(fit)))   # Approximate standard errors
     ## Not run:  rlplot(fit) 

