Statistical Problems at Google: Estimation of Web Page Change Rates
10/25/2007
Eric Chi
Amgen
Drug Development and Some Statistical Applications in the Biopharmaceutical Industry
11/1/2007
Kevin J. Grimm
Psychology, UC Davis
Multitrait-Multimethod Models for Developmental Research
11/8/2007
Chuck McCulloch
Biostatistics, UCSF
Prediction of Random Effects and Effects of Misspecification of their Distribution
11/15/2007
Javier Rojo
Rice University
11/16/2007
Kaare Christensen
Odense, Denmark
11/29/2007
Ching-Shui Cheng
UC Berkeley
Abstracts
Most recent seminar first.
Fall 2007
Thursday November 1st, 2007
Speaker:
Kevin J. Grimm (Psychology, UC Davis)
Title:
Multitrait-Multimethod Models for Developmental Research
Abstract:
In this talk, I will present some of my most recent work extending models for longitudinal data analysis to investigate the development, as well as the prediction of changes, in children’s behavior through elementary school. Specifically, multitrait-multimethod (MTMM) confirmatory factor models and latent growth curves were combined to examine propositions of trait and method stability in mother-, father-, and teacher-reported checklists of child behaviors. Data from the NICHD Study of Early Child Care and Youth Development (SECCYD) were analyzed. In the SECCYD, the Child Behavior Checklist (CBCL), an informant-based questionnaire of children’s behavior, was completed by the study child’s mother, father, and teacher in first, third, fourth, and fifth grades. The behavior rating scales were shown to be influenced by the traits (i.e., aggression, delinquency, depression, withdrawn) as well as by the informant who completed the form. Longitudinally, the between-person differences in the traits and methods were found to be stable through elementary school. The longitudinal MTMM model was then combined with second-order latent curve models to evaluate change, and child (e.g., gender) and paternal (e.g., depression) characteristics were then included as predictors of trait change and informant-based variance. The covariates were shown to be related to both traits and informants, suggesting a more accurate prediction of the influences on children’s development and of informant-based bias in the CBCL. Methodological extensions of longitudinal MTMM models and benefits of an MTMM approach to developmental research are discussed.
Drug Development and Some Statistical Applications in the Biopharmaceutical Industry
Abstract:
Statistics is a well-recognized scientific tool for extracting information from study data, whether experimental or observational. Many such studies are conducted in the biopharmaceutical industry during the development of candidate products. An overview of this development will be provided, and statistical applications in some specific areas will be highlighted.
Statistical Problems at Google: Estimation of Web Page Change Rates
Abstract:
Search engines strive to maintain a "current" repository of all pages on the web to index for user queries. However, crawling all pages all the time is costly and inefficient: many small websites don't support that much load, and while some pages change very rapidly, others don't change at all.
As a result, the estimated frequency of change is often used to decide how often a web page needs to be crawled. Here we consider different ways to estimate rates of change from censored data observed by a web crawler, and examine the statistical performance of those estimators with respect to correctly crawling a newly discovered page.
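To make the censoring concrete, here is a minimal sketch of one textbook estimator, not necessarily one of those discussed in the talk. It assumes (our assumption, not the abstract's) that a page changes according to a Poisson process with constant rate and that the crawler visits at a fixed interval, so each visit reveals only whether at least one change occurred since the last visit. The function names and the equal-interval setup are hypothetical.

```python
import math


def naive_change_rate(changed_flags, crawl_interval):
    """Detected changes per unit time; biased low because several
    changes inside one interval are seen as a single change."""
    n = len(changed_flags)
    if n == 0:
        raise ValueError("need at least one crawl interval")
    return sum(1 for f in changed_flags if f) / (n * crawl_interval)


def mle_change_rate(changed_flags, crawl_interval):
    """Maximum likelihood rate under the assumed Poisson model.

    With rate lam, an interval of length d shows a change with
    probability 1 - exp(-lam * d), so lam_hat = -log(1 - x/n) / d,
    where x of the n intervals showed a change.
    """
    n = len(changed_flags)
    if n == 0:
        raise ValueError("need at least one crawl interval")
    x = sum(1 for f in changed_flags if f)
    if x == n:
        # Every interval showed a change: the likelihood keeps
        # increasing in lam, so the estimate is unbounded.
        return math.inf
    return -math.log(1.0 - x / n) / crawl_interval


# Hypothetical crawl history: daily visits, change seen on 2 of 4 days.
history = [True, False, False, True]
print(naive_change_rate(history, 1.0))
print(mle_change_rate(history, 1.0))
```

The unbounded estimate when every visit shows a change is one reason a newly discovered page, with only a handful of visits, is a hard case for this kind of estimator.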
We will look at a resampling framework for spatial data where, prior to performing the bootstrap, each event of a spatial point process or observation location of a random field is first assigned a mark. These marks depend on the observed data and on the particular statistic of interest. In the actual bootstrap, the marks are resampled together with the points, and bootstrap estimates are computed from these marks. This "marked point bootstrap" is very similar to the block-of-blocks bootstrap briefly described in Künsch (1989).
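The mark-then-resample idea can be sketched in a few lines for a 1D point pattern. This is an illustration under our own assumptions, not the speaker's procedure: the statistic (average number of neighbours within distance r), the use of equal-width blocks as the resampling unit, and all function names are hypothetical choices.

```python
import random


def neighbour_marks(points, r):
    """Mark each point with the number of other points within distance r,
    computed once from the full observed pattern."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and abs(p - q) <= r)
        for i, p in enumerate(points)
    ]


def marked_point_bootstrap(points, length, r, n_blocks, n_boot, seed=0):
    """Bootstrap replicates of the mean neighbour count on [0, length].

    Points carry their marks into equal-width blocks; blocks are then
    drawn with replacement and the statistic is recomputed from the
    resampled marks only, never from the resampled point positions.
    """
    rng = random.Random(seed)
    marks = neighbour_marks(points, r)
    width = length / n_blocks
    blocks = [[] for _ in range(n_blocks)]
    for p, m in zip(points, marks):
        idx = min(int(p / width), n_blocks - 1)  # right edge goes in last block
        blocks[idx].append(m)
    estimates = []
    for _ in range(n_boot):
        drawn = [blocks[rng.randrange(n_blocks)] for _ in range(n_blocks)]
        resampled = [m for block in drawn for m in block]
        if resampled:  # skip replicates that contain no points
            estimates.append(sum(resampled) / len(resampled))
    return estimates


# Hypothetical pattern on [0, 4].
pts = [0.1, 0.2, 0.9, 1.5, 1.6, 2.4, 3.3, 3.35, 3.9]
reps = marked_point_bootstrap(pts, length=4.0, r=0.15, n_blocks=4, n_boot=200)
print(min(reps), max(reps))
```

Because the marks are fixed before resampling, neighbour pairs that straddle a block boundary still contribute, which a bootstrap that recomputes the statistic on the reassembled blocks would lose.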
We will present results of a simulation study and describe a consistency property in the infill asymptotics regime for certain parameters in a particular Gaussian random field model in 1D. We apply the procedure to an astronomy data set. Time permitting, we will show some ongoing work on using this resampling framework with a non-stationary point process, applying it to a rainforest data set.
References:
Künsch, H. R. (1989). The jackknife and the bootstrap for general stationary observations. Annals of Statistics 17, 1217-1241.
Thursday September 27th, 2007
Speaker:
Jane-Ling Wang (Dept. Statistics, UC Davis)
Title:
Joint Modeling of Longitudinal and Survival Data
Abstract:
In clinical studies and experimental aging research, it has become increasingly common to observe an event time of interest, usually referred to as a survival time, along with baseline and longitudinal covariates. Both the survival and covariate processes are of interest, as is the relationship between them. Due to several complications, traditional approaches, including the partial likelihood approach for the Cox proportional hazards model and the rank-based approach for the accelerated failure time model, encounter difficulties when longitudinal covariates are involved in the modeling of survival times.
Moreover, the longitudinal processes are often subject to informative dropout. Jointly modeling the survival and longitudinal data emerges as an effective way to overcome these difficulties.
In this talk, we will discuss the challenges in this area and provide several solutions. One of the difficulties is that maximum likelihood estimates (MLE) often do not exist when the survival component is modeled semiparametrically as in Cox or accelerated failure time models. Several alternatives will be illustrated, including nonparametric MLEs, the method of sieves, and pseudo-likelihood approaches. Another difficulty is related to the parametric modeling of the longitudinal component. Nonparametric alternatives will be considered to deal with this complication.
The talk is based on joint work with Jimin Ding, Fushing Hsieh and Yi-Kuan Tseng.