Serendeputy - your personal news assistant.

I am doing an exercise of machine learning, and I have built a Gaussian Naive Bayes classifier (i.e., I have defined values of mean and standard deviation) using scikit-learn. Now I am supposed to "compute the error rate using k-fold validation based...
From: Stats Stack Exchange | By: badnack | Friday, April 17, 2015
I would like to determine whether a given time series has a trend. I have carried out a Cox-Stuart test in R and decomposed the series to inspect it visually, but I am still a bit confused about whether there is evidence of a trend. The command used...
From: Stats Stack Exchange | By: user42571 | Monday, April 20, 2015
I got a bit confused toward the end of this proof, so I am asking for a check. Take $$Y_n = \begin{cases} 1 & \text{with probability } 1 - p_n \\ n & \text{with probability } p_n \end{cases}$$ Assume $p_n \rightarrow 0$; prove that $Y_n$ converges...
From: Stats Stack Exchange | By: Monolite | Sunday, April 19, 2015
I have a set of 10 variables: 9 explanatory, 1 response. I wish to do a constrained regression on the variables and use the values of the coefficients as weights in a TOPSIS analysis. I am having several issues with this - but I think the biggest one...
From: Stats Stack Exchange | By: TheBean | Sunday, April 19, 2015
What is the difference between the variance inflation factor (VIF) and stepwise regression as both help in detecting multicollinearity? What variables are different while running both techniques?
From: Stats Stack Exchange | By: neha | Sunday, April 19, 2015
I originally asked this question in Overflow but someone suggested I post it here. I'm trying to model the number of parks in a neighborhood as a function of education, land area (both continuous variables), and poverty percentage (categorical). There...
From: Stats Stack Exchange | By: user3642531 | Sunday, April 19, 2015
My professor mentioned using the invnorm function on my calculator, but many websites say that you need a mean and standard deviation to figure out the answer. Anybody have an idea of where I can begin? Thank you for your help.
From: Stats Stack Exchange | By: Lynda Strasser-Schweitzer | Sunday, April 19, 2015
For strictly educational purposes, our fictitious high school utilizes a GPA grading system represented by an ordinal variable ranging from 0 to 4 (5 potential inputs), and the board test submitted for college admissions is ranked from 0 to 10 (11 potential...
From: Stats Stack Exchange | By: Ioannis Tikas | Sunday, April 19, 2015
I am having difficulties specifying the appropriate structure for nested/random effects in a mixed model that I am trying to pass through the 'Lasso' shrinkage algorithm. I am using the package 'glmmLasso'. My data consists of disease incidence data...
From: Stats Stack Exchange | By: johnybinwv | Sunday, April 19, 2015
Given that our loss function is an $\alpha$-strongly convex function, we know the online gradient descent algorithm can get $C\log(T)/\alpha$ for some constant $C$. My question is what happens when $\alpha$ is close to zero (in particular, much smaller than 1)...
From: Stats Stack Exchange | By: Raba Poco | Sunday, April 19, 2015
I am trying to predict a density function using LOESS in R. However, the predicted values I got are not on the estimated LOESS line. #Generate data n<-10000 a1<-a2<-0.1 a3<-a4<-0.2 a0<-0.1 u1 <-rnorm(n,0,1) u2 <-runif(n,0,1) u3...
From: Stats Stack Exchange | By: user37180 | Sunday, April 19, 2015
I am going through a model selection process with a mixed-model with 3 variables: A, B, and C. B and C are orthogonal factors. B or C may interact with A, so my full model would be: fixed: Y ~ A + B + C + A*B + A*C random: ~1|D When I run my analysis,...
From: Stats Stack Exchange | By: user14241 | Sunday, April 19, 2015
Suppose I am using random forests where the classes are highly unbalanced. How do you detect overfitting and what can you do to avoid it? Breiman says in his paper that random forests do not overfit, but others say that they can. If overfitting does...
From: Stats Stack Exchange | By: lord12 | Saturday, April 18, 2015
I have been reading about the Johansen cointegration model. I am using the jcitest function of MATLAB. I have a little trouble selecting the 'model'. Can somebody explain in layman's terms which is appropriate where? I can't understand what having...
From: Stats Stack Exchange | By: cryptex | Sunday, April 19, 2015
Suppose I have $n$ data points $x_1,\dots,x_n$, each of which is $p$-dimensional. Let $\Sigma$ be the (non-singular) population covariance of these samples. With respect to $\Sigma$, what is the most efficient way known to compute the vector of squared...
From: Stats Stack Exchange | By: Lepidopterist | Sunday, April 19, 2015
See this question on Math SE. Short story: I read The Elements of Statistical Learning and got frustrated when I was trying to verify some of the results, e.g., given $$\text{RSS}(\beta) = \left(\mathbf{y}-\mathbf{X}\beta\right)^{T}\left(\mathbf{y}-\mathbf{X}\beta\right)\text{,}$$...
From: Stats Stack Exchange | By: Clarinetist | Sunday, April 19, 2015
The linearized rate is a method to summarise a constant hazard function in a very simple way, and is defined as the total number of observed events divided by total patient-years (person-years). These rates should be reported with CIs. I was trying to calculate linearized...
From: Stats Stack Exchange | By: Rafik Margaryan | Sunday, April 19, 2015
I am trying to implement my own cross correlation function in R by translating it into a convolution problem. Part I: I have two arrays, e.g. two identical arrays, and I want to get the cross correlation in R. Do I then need the following code? a1 = 1:9 a2...
From: Stats Stack Exchange | By: PeteChro | Saturday, April 18, 2015
I keep getting an error when I try to run this simulation. Could somebody tell me what I might be doing wrong? P <-matrix(c(0.2,0.8,0.3,0.7,0.5,0.5), nrow=3,byrow=T) results <- numeric(1000) set.seed(87654321) for (i in c(20,30,40)){ y <-...
From: Stats Stack Exchange | By: odb | Sunday, April 19, 2015
I have a least squares fit like this: fit = lsfit(log10(M), log10(RS), wt) This function lists statistics and p-values for the coefficient under the null hypothesis that it is zero, but I want to change the null hypothesis of the coefficient from 0 to...
From: Stats Stack Exchange | By: Fred | Saturday, April 18, 2015
I am trying to build a second-order Markov chain model, and now I am trying to find the transition matrix from the following data. dat<-data.frame(replicate(20,sample(c("A", "B", "C","D"), size = 100, replace=TRUE))) Now I know how to fit the first order Markov...
From: Stats Stack Exchange | By: simonyy | Sunday, April 19, 2015
I need a corpus to try Mahout classification. I've tried AG's corpus of news articles, downloaded from http://www.di.unipi.it/~gulli/AG_corpus_of_news_articles.html, but that was not enough for me because it has just small articles. Please help...
From: Stats Stack Exchange | By: Ben Youb | Sunday, April 19, 2015
I have this table for chi-square in R: x <- matrix(c(23,22,10,14,11,12),ncol=3) phi(x) This is a 3 by 2 table, and thus phi correlation won't work here. Can anyone help me get code for phi correlation (or something similar) in R for multiple cells?...
From: Stats Stack Exchange | By: jbest | Sunday, April 19, 2015
What is a concrete example of a Bayesian resolution to the Two Envelopes Problem?
From: Stats Stack Exchange | By: Garrett | Sunday, April 19, 2015
As far as I understood it, chi-square provides a measure for determining the similarity of the expected and observed (empirical and theoretical) distributions of nominal variables. It can be employed, for example, in a goodness-of-fit-test that enables...
From: Stats Stack Exchange | By: wehnsdaefflae | Friday, April 17, 2015
I am kind of new to statistics. I have 4 independent variables that have been observed from a system in 4 different configurations. At this point, I don't know which statistical functions would make the best comparison between them. I...
From: Stats Stack Exchange | By: lonesome | Sunday, April 19, 2015
I have to plot a few different simple linear models on a chart, the main point being to comment on them. I have no data for the models. I can't get R to create a plot with appropriate axes, i.e. I can't get the range of the axes correct. I think I'd...
From: Stats Stack Exchange | By: briantreg | Sunday, April 19, 2015
Hi there, I am trying to perform a visual analysis of significance on these stats. Other information provided is the Standardized motor skills test score [M = 100, SD = 15], not sure if this is relevant. What I can see is that the difference between...
From: Stats Stack Exchange | By: Tracy | Sunday, April 19, 2015
I have a dataset X, and I'm trying to predict the response variables a, b, c given an instance x. Typically, one might run whatever regression routine on a, b, and c separately. However, what happens if a, b, and c are closely related? For example,...
From: Stats Stack Exchange | By: user1858363 | Sunday, April 19, 2015
I'm looking at a computer vision application where I try to analyze the strength of the edges that a certain set of colors makes with another color. For this, I take images of two colors falling on top of each other and record the edge strength (through a gradient)...
From: Stats Stack Exchange | By: dev_nut | Saturday, April 18, 2015
Bookmakers quite often price players to score a goal at any point during the game. For example, they may give Ronaldo a 52% chance of scoring a goal in a game, and Messi a 60% chance of scoring a goal in a game. However, how do you work out the possibility...
From: Stats Stack Exchange | By: Odds Help | Sunday, April 19, 2015
This question was taken from a practice exam in my statistics course. Given a random sample $X_1, X_2, ... X_n$ from a Poisson distribution with mean $\lambda$, can you show that $\bar{X}$ is consistent for $\lambda$? We are told to use Tchebysheff's...
From: Stats Stack Exchange | By: Nicky_Ay | Sunday, April 19, 2015
For a sequence $X_1, X_2, \dots$, let $F_n(x)$ denote the cdf of $X_n$. Suppose our sequence is $X_n \sim N(0,n)$; then for all $x$ the point-wise limit of $F_n(x)$ is $\frac{1}{2}$. How would one prove this?
From: Stats Stack Exchange | By: Monolite | Saturday, April 18, 2015
Is the process of calculating the residual standard error the same for the training set and the test set?
From: Stats Stack Exchange | By: caroline | Saturday, April 18, 2015
I came across an old exam question as follows: If the life of one computer component (in years) has Gamma distribution with mean $6$ and variance $18$, how can we find the probability that this component has a lifetime of at least $9$ years? What is...
From: Stats Stack Exchange | By: Dr. Hoshang | Saturday, April 18, 2015
I've implemented the Bayesian Probabilistic Matrix Factorization algorithm using pymc3 in Python. I also implemented its precursor, Probabilistic Matrix Factorization (PMF). See my previous question for a reference to the data used here. I'm having...
From: Stats Stack Exchange | By: Mack | Saturday, April 18, 2015
I am developing an artificial model to simulate the growth of two types of biological cells under different conditions. The data I obtained from my model takes the form of two data-sets representing the number of cells in the culture across time in what...
From: Stats Stack Exchange | By: max0005 | Saturday, April 18, 2015
1) I think one of the algorithms used to handle ties for the Wilcoxon rank-sum test (a.k.a. Mann-Whitney U test) is Streitberg / Rohmel. I could not find a good source which explains the algorithm, gives a proof, or even simply outlines the algorithm....
From: Stats Stack Exchange | By: a.e. | Saturday, April 18, 2015
I have fitted an ordinal probit model in Stata and have 2 queries. Kindly help. The parallel lines assumption test (run by oparallel or brant) does not run, and it gives the error that the test is only for logit models. Is it so? fitsat(test for goodness...
From: Stats Stack Exchange | By: numra | Saturday, April 18, 2015
I am trying to implement a neural network to identify a nonlinear system. I have implemented a very simple system in Simulink, and on the basis of examples of its input and output I would like to have the NN mimic its behaviour. The system is the following...
From: Stats Stack Exchange | By: MagoNick | Saturday, April 18, 2015
I am a bit new to the whole nonparametric and Bayesian idea, so tell me if this is correct: to estimate, say, the mean of a dataset's population we do the following: We define a function $f(x)$ that is the PDF of our prior assumption of the distribution...
From: Stats Stack Exchange | By: Simon Kuang | Saturday, April 18, 2015
I asked this question in Stack Overflow: http://stackoverflow.com/questions/29710525/symbol-in-r-lm I feel like here would be a better place to get an answer. What exactly does the ^ symbol do to the regression and why does it make the r^2 so much higher?...
From: Stats Stack Exchange | By: japem | Saturday, April 18, 2015
I have three groups A, B, C, with participant ns of 20, 89, and 165. Each participant ranked her or his concern with 14 items (potential impediments to success). Scale was 0-1-2-3, 3 = most concern. I have the mean rank of each item for each group, and...
From: Stats Stack Exchange | By: Doug A.C. | Saturday, April 18, 2015
Following are acf and pacf plots of a monthly data series. The second plot is acf with ci.type='ma': The persistence of high values in the acf plot probably represents a long term positive trend. The question is whether this represents seasonal variation. I tried...
From: Stats Stack Exchange | By: rnso | Saturday, April 18, 2015
I'm examining a code in C++ for a nonlinear fit. It is basically a Levenberg Marquardt routine you can find on Netlib or elsewhere. The last step is estimating the errors of the parameters that are fitted. From literature, I know that the variance of...
From: Stats Stack Exchange | By: Clemens | Saturday, April 18, 2015
I am using R2jags to fit a model in R using JAGS. Here is my code: predictorNames <- c("BMIX", "AGE", "TEXPWK", "FRUITS", "VEGTABLS", "FISH", "REDMEAT", "POULTRY", "SOY", "NUTS", "GRAINS", "WHLGRNS", "MILKS", "DAIRY", "RACE.BLACK", "REGION.NE", "REGION.MW",...
From: Stats Stack Exchange | By: user3821273 | Saturday, April 18, 2015
I am studying an experiment of the kind: Let $n_{ij}$ be the number of fetuses, $X_{ij}$ the number of responses, i.e. the number of fetuses with a malformation in the $j$th litter of the $i$th dose level for $j=1,\dots,25$ and $i=1,\dots,5$. Then, $p_{ij}$ is the...
From: Stats Stack Exchange | By: CrishaD | Saturday, April 18, 2015
I took a test two days ago. One of our questions is as follows: a decision tree with depth 2 is constructed for two binary features. The hypothesis space that can be shown with the following tree has how many features? The answer sheet gives the solution as $16$ but...
From: Stats Stack Exchange | By: Anjela Dark | Saturday, April 18, 2015
I have a population of n unique items and am taking a sample of r. I am sampling with replacement. I would like to calculate the probability of sampling any specific item x times given the sample size and population.
From: Stats Stack Exchange | By: David | Saturday, April 18, 2015
I have the numeric values to plot a probability density function. They look like 0.000390911, 0.00039091099989183763, 0.0003909109997836753, 0.0003909109996755129, 0.0003909109995673506, 0.0003909109994591882, 10.398579783795636, 10.398469842516338,...
From: Stats Stack Exchange | By: triub | Friday, April 17, 2015