## Welcome to Serendeputy!

Serendeputy is your personal news assistant. Your deputy:

- learns what you like and don't like,
- lovingly compiles a list of news and blogs for you.

You can help your deputy learn by searching, clicking links and pressing the little smiley faces.

What to do:
1. Click links to teach your deputy
2. Click smileys and frownies
3. Find favorite topics and sources
4. See how much better your deputy is getting at finding you good stuff

# Stats Stack Exchange

Using the package "mgcv" I fitted a GAM to demonstrate head circumference changes over the lifetime for data stemming from two different countries: `library("mgcv")` `gam1 <- gam(Size ~ s(Age, k = 12, bs = "cr") + Country1 + Country2 + Sex, data = sizedata, method = "REML")`...
From: Stats Stack Exchange | By: StevenP | Friday, November 21, 2014
This is my first question on this forum. Straight to the problem: I want to test the hypothesis that two coefficients of a multinomial regression model are equal. For example, considering a model where Y is a 3-level categorical variable: A, B,...
From: Stats Stack Exchange | By: Giovanni Romeo | Friday, November 21, 2014
I am looking at the tenure (i.e. elapsed time between hire date and termination date) of various employees in a system. I want to test the hypothesis that two groups of employees have the same tenure distribution. This distribution is generally exponential...
From: Stats Stack Exchange | By: src | Friday, November 21, 2014
First I'm not sure if this is the right place to post my question, but I saw some questions about ANN, and I assumed I can ask it here. I have implemented an ANN with back-propagation. I'm using it for Wi-Fi based indoor localization using fingerprinting....
From: Stats Stack Exchange | By: Alaa | Friday, November 21, 2014
Let's consider the following log-linear model: $\log(Y_i) = \alpha + X_i\beta + \epsilon_i$ for $i = 1, \ldots, N$. The fitted value is: $\widehat{\log(Y)} = \hat{\alpha} + X\hat{\beta}$. Assuming $\epsilon \sim N(\mu, \sigma^2)$, we can express the fitted value...
From: Stats Stack Exchange | By: Mayou | Friday, November 21, 2014
I know what SRM is, but I don't understand the relation between SRM and SVMs. Can anyone explain this to me? Why do they say that SVMs rely on an SRM approach? Thank you so much!
From: Stats Stack Exchange | By: Kevin | Thursday, November 20, 2014
I have a data set with four observations consisting of the variable $Y$ measured at time $t_0=0$ and at times $t_1, t_2$ and $t_3$. I would like to fit the following model: $$\log(y_j) = \alpha + \beta \log(t_j) + \epsilon_j$$ Because of the logarithm,...
From: Stats Stack Exchange | By: user7064 | Friday, November 21, 2014
In linear regression I have come across a delightful result that if we fit the model $$E[Y] = \beta_1 X_1 + \beta_2 X_2 + c,$$ then, if we standardize and centre the $Y$, $X_1$ and $X_2$ data, $$R^2 = \mathrm{Cor}(Y,X_1) \beta_1 + \mathrm{Cor}(Y, X_2)...
From: Stats Stack Exchange | By: Corone | Thursday, November 20, 2014
I have data in the form of $N$ sequences $s_j=(t_i, e_i)_{i\in\{1,\ldots,n_j\}}$ with $n_j$ data points each, where $t_i$ is a time-stamp and $e_i$ is a categorical event, say $e_i\in\{A,B,C,D\}$. The $N$ sequences are independent. I want to find short (i.e.,...
From: Stats Stack Exchange | By: thias | Thursday, November 20, 2014
I want to estimate a multilevel logit model. But I'm confused about the minimum number of groups and observations per group. What would be the minimum number of observations per group? My case: I have a small sample over 200 observations that I can classify...
From: Stats Stack Exchange | By: Tappin73 | Thursday, November 20, 2014
$f_{X,Y}(x, y) = 1$ for $0 \le x \le 1,\ 0 \le y \le 1$, and $0$ otherwise. How to calculate $P(|X - Y| \le 1/6)$?
From: Stats Stack Exchange | By: Ghetty Noman | Friday, November 21, 2014
I have pollution data (quantitative) plotted against time (categorical), the hours of the day. Via ANOVA testing I've found significance at many of the hours, however, the relationship is definitely not linear. For example, it tends to rise in the morning,...
From: Stats Stack Exchange | By: compguy24 | Thursday, November 20, 2014
I have the following data: A | B --------------- 9,794 | 10,7098 9,8022 | 10,5176 9,7055 | 11,1039 9,7091 | 11,1474 10,1882| 10,4693 10,2204| 10,8072 9,8221 | 11,2713 9,9272 | 11,2888 10,0855| 10,9026 10,108 | 10,872 10,1433| 10,9649 10,1432| 10,9805...
From: Stats Stack Exchange | By: andrepcg | Thursday, November 20, 2014
I am using SVR for statistical down-scaling of precipitation. I have taken the first 3 factor scores in principal component analysis of the variables as predictors and precipitation as predictand. As precipitation in the area where I am working is less, most...
From: Stats Stack Exchange | By: Naveen Reddy | Friday, November 21, 2014
There is a model expressed as $y_t = h^T y_{t-1} + b^T x_t$ where $x$ is a zero mean uncorrelated white Gaussian noise of unity variance $\sigma^2_x$. This appears to be an ARMA model. But, I am confused about the model order for the MA part which contains...
From: Stats Stack Exchange | By: Ria George | Wednesday, November 19, 2014
I am trying to figure out if it's possible to create a vector velocity field from some raster containing the spatial distribution of a scalar variable (timestamp of some mass-media event in my case). Here is my GeoTIFF overlapped with isochrones drawn from the...
From: Stats Stack Exchange | By: Vitaly Isaev | Thursday, November 20, 2014
Let's say I have a segment of pipe of length X that I want to test the integrity of by way of a hydrostatic pressure test. That is, I isolate the ends of the pipe and pump water into it until a very high pressure is attained and hold that pressure...
From: Stats Stack Exchange | By: Mark B | Thursday, November 20, 2014
I have data which I suspect follows a power function over time. It is collected from several units which have different intercepts. Therefore I'd like to do a mixed model with the parameters of the power function as fixed effects and intercept as random...
From: Stats Stack Exchange | By: Jonas Lindeløv | Thursday, November 20, 2014
(This question is related to a previous one I made, here) I have a set of 2D observations (measured data) of sample size $N$: $$O = \{(x_1, y_1), (x_2, y_2), ..., (x_N, y_N)\}$$ I also have a model $S(v_1, v_2, ..., v_p)$, which depends on $p$ parameters,...
From: Stats Stack Exchange | By: Gabriel | Thursday, November 20, 2014
The derivation of the gradient of the marginal likelihood is given in this pdf, equation 5.9. But the gradient for the most commonly used covariance function, squared exponential covariance, is not explicitly given. I am implementing the Rprop algorithm...
From: Stats Stack Exchange | By: Polymorpher | Thursday, November 20, 2014
I suspect that this is entirely possible since the endogenous variable coefficient can be biased in many possible ways, thus leading to a near 0 estimate despite having a real causal relationship. More formally, let Y be the dependent variable, X be...
From: Stats Stack Exchange | By: Heisenberg | Thursday, November 20, 2014
I have a normally distributed SPSS dataset with 1400+ cases. I want to create sub-groups based on "age in months" that are distributed evenly. The dataset ranges from 12 to 60 "age in months". Weighted average percentiles are 20 mos (5%), 23 mos (10%),...
From: Stats Stack Exchange | By: jam320 | Thursday, November 20, 2014
As there is no published list of the population (employees), how can I select my sample? Even if I wanted to, I cannot go for judgmental sampling, as the academicians in my University strongly disagree with non-probabilistic sampling methods for quantitative...
From: Stats Stack Exchange | By: Shaki | Thursday, November 20, 2014
I've got a multivariate dataset (p=2) that I'm trying to calculate the W matrix for, for use in canonical variates analysis. If each $x_{kj}$ is the jth observational unit from the kth group, and $\bar{x}_k$ is the mean vector for the kth group, then: W =...
From: Stats Stack Exchange | By: Nick | Thursday, November 20, 2014
I obtained an ACF plot from R. Please see below: Does that mean the observations are independent? What do small autocorrelations imply? Thank you very much!...
From: Stats Stack Exchange | By: Rmania | Thursday, November 20, 2014
Quote from Wikipedia: In statistics, a consistent estimator or asymptotically consistent estimator is an estimator, a rule for computing estimates of a parameter $\theta^*$, having the property that as the number of data points used increases indefinitely,...
From: Stats Stack Exchange | By: Charlie Parker | Thursday, November 20, 2014
Let $x$ and $\gamma$ be vectors. Here it says that $$E[y-x'\gamma]^2 = E[(y-E[y|x])^2 + (E[y|x]-x'\gamma)^2]$$ However, I don't see why $$E[(y-E[y|x])(E[y|x]-x'\gamma)] = 0.$$ By the way, $E$ is the same as $E_{x,y}$ for all purposes. I wasn't getting...
From: Stats Stack Exchange | By: Clark Kent | Thursday, November 20, 2014
I'm an undergraduate student. I read about the multivariate normal distribution in Hogg and Craig, and I wonder why the covariance is allowed to be positive SEMI-definite. I read this Normal distribution with positive SEMI-definite covariance matrix And...
From: Stats Stack Exchange | By: INFH | Thursday, November 20, 2014
My formula looks like this: `lm(formula=BearWeight ~ honey + age, data=BearData)`. My output looks like this. I am told to interpret each estimate and then use the estimates to predict the weight of a 15-month-old bear given the Low Honey treatment. Can...
From: Stats Stack Exchange | By: user3105519 | Thursday, November 20, 2014
Given a population (normal distribution) mean and variance and a sample set with n<30 values, how should I perform hypothesis testing? My professor told me that I should be using the t-test for this case. I believe that I understand why a t-test is...
From: Stats Stack Exchange | By: orbv12 | Thursday, November 20, 2014
In many of our product tests and attitudinal studies where we ask for a rating on Overall Liking (of a product), product managers use a Likert scale of either 7 or 9 points with each point anchored (e.g., 7 = Extremely like, 6 = Moderately like, etc.). We now want...
From: Stats Stack Exchange | By: Luisa | Wednesday, November 19, 2014
For example, for p as the parameter to a binomial or bernoulli, or a Poisson, what would a flat prior p be? What does it mean to be "flat" - does this refer to diffuse?
From: Stats Stack Exchange | By: user2896468 | Wednesday, November 19, 2014
I'm trying to compute an estimate for the variance of the estimated coefficients in a non-linear regression using the formula described in link. I can't figure out how to build $F_{ij}$. Let's consider for simplicity a quadratic regression: $f = \beta_0...
From: Stats Stack Exchange | By: user1584773 | Wednesday, November 19, 2014
I have been trying to figure out how to plot a multiple regression for a training set with the K(KNN regression). The package name is KKNN for R. The line below expresses the multiple regression model we decided best fits our dataset. We are looking...
From: Stats Stack Exchange | By: user1889418 | Wednesday, November 19, 2014
Using TraMineR, is it possible to compare two sequence objects to calculate the discrepancy between them? By this I mean not comparing two sequences but two sets of sequences. Is this possible? If so, how do I go about doing it?...
From: Stats Stack Exchange | By: histelheim | Wednesday, November 19, 2014
In GLMs we can use a quasi-Poisson fudge factor to account for overdispersion in our Poisson models. In GLMMs we can add an individual-level random effect (e.g. id) for each row in the data.frame to account for overdispersion, i.e. glmer(y ~ x + (1|group)...
From: Stats Stack Exchange | By: user1322296 | Wednesday, November 19, 2014
I am looking for a way to quickly compute the central moments of a Poisson random variable. I've found a couple of resources on how to compute the central moments, but I'm still trying to figure out if there are any fast algorithms out there. Of course...
From: Stats Stack Exchange | By: animal_magic | Wednesday, November 19, 2014
I am trying to understand the following example from page 191 of Nelsen's book "Introduction to copulas". $\int\int_{I^2} M(u,v)dM(u,v)=\int_0^1 u du$ where $M(u,v)=min(u,v)$ is Frechet-Hoeffding's upper bound. Using $dM(u,v)=\frac{\partial M}{\partial...
From: Stats Stack Exchange | By: Mike | Wednesday, November 19, 2014
I have paired gene expression data before and after a treatment, as well as an ordinal response variable with 3 levels for each sample after treatment. I am interested in the correlation of the (change in expression of 30 genes before and after) with...
From: Stats Stack Exchange | By: user2379487 | Wednesday, November 19, 2014
I'm trying to model the distribution of effects of mutations (let's call it s) in evolution but I'm stuck in generating the probability distribution function (pdf) for my model. So, my model is a mixture of 3 components. As far as I understood, if I...
From: Stats Stack Exchange | By: Diogo Santos | Wednesday, November 19, 2014
I am working on something and ran into a problem. Can anyone offer some assistance? Let $X_1, X_2, \ldots$ be iid Bernoulli RVs, $X_i \sim \mathrm{Bern}(p)$. Let $S_n = X_1 + \cdots + X_n$, $\bar{X} = S_n/n$, and $T_n = \sqrt{n}(\bar{X} - p)/\sqrt{p(1-p)}$. Find the limiting value of $\lim_{n\to\infty} M_{\bar{X}}(t)$...
From: Stats Stack Exchange | By: Joel Sinofsky | Wednesday, November 19, 2014
In our model we are getting multicollinearity issues. But the problem is that we can't combine the variables or drop certain variables to get rid of multicollinearity. Model Structure: My model has the following structure. This structure is taken...
From: Stats Stack Exchange | By: Beta | Wednesday, November 19, 2014
A psychologist is collecting data on the time it takes to learn a certain task. For $55$ randomly selected adult subjects, the sample mean is $10.5$ minutes and sample standard deviation is $3.25$ minutes. Construct a $99\%$ confidence interval for the...
From: Stats Stack Exchange | By: user59616 | Wednesday, November 19, 2014
I know the relation between the distribution of sample means (for a given sample size) and the population parameters, from which samples are taken. My question is, how exactly do we use this relation to do interval estimation? More specifically, when...
From: Stats Stack Exchange | By: Macond | Wednesday, November 19, 2014
I'm trying to replicate a study where the author used the McNemar test to assess the performance of classification compared to random classification. I have the original classifier and I'm using R to do the McNemar test, but I don't know how I'm supposed...
From: Stats Stack Exchange | By: Patrick Brandão | Wednesday, November 19, 2014
I was wondering if anyone has experience with a repeated mixed effects model in R. I am working with trees that were fertilized in a full factorial design (NxPxK) in plots that are replicated four times. I currently have a mixed effects model with this...
From: Stats Stack Exchange | By: Annette | Wednesday, November 19, 2014
I have two groups of persons, GRP0 and GRP1, on which I measured three continuous variables: VAR1, VAR2 and VAR3. I would like to use Mancova in R with: - VAR1, VAR2 and VAR3 as outcome variables - GRP={0,1} as predictor variable - age and gender as...
From: Stats Stack Exchange | By: user49088 | Wednesday, November 19, 2014
I am trying to create a linear regression model containing two predictors and 1 response variable. My response variable has a short term pattern, i.e. surge during weekdays and slump during weekends and I suspect this pattern is a result of two things:...
From: Stats Stack Exchange | By: vagabond | Wednesday, November 19, 2014
Here is a very simple question: On country-level data, I am running two different fixed-effects models using Stata commands, but I don't know why not only the estimated standard errors but also the coefficients are different. The first regression uses...
From: Stats Stack Exchange | By: Olivier Thevenon | Wednesday, November 19, 2014
Since neither the package description nor other literature I've screened so far can give me an answer I actually understand, I'd like to ask you guys the following: How should I interpret/understand the argument "type" in R's VAR() function? What do those...
From: Stats Stack Exchange | By: George | Wednesday, November 19, 2014