# Stats Stack Exchange

I have read the term "machine intelligence" in a few places, e.g. https://web.archive.org/web/20170219022131/https://research.google.com/pubs/MachineIntelligence.html: Research at Google is at the forefront of innovation in Machine Intelligence, with...
From: Stats Stack Exchange | By: Franck Dernoncourt | Sunday, February 19, 2017
I have seen that people have put a lot of effort into SVMs and kernels, and they look pretty interesting as a starting point in machine learning. But if we expect that we can almost always find a better-performing solution using a (deep) neural network, what is the...
From: Stats Stack Exchange | By: Hyoseung Robin Kang | Monday, February 20, 2017
I have a data set with the start and end dates of different financial products, such as accounts, loans, debit and credit cards, etc. I am trying to predict the purchase of a given product. This data allows me to make variables such as time since purchase of...
From: Stats Stack Exchange | By: Ullsokk | Monday, February 20, 2017
Independence of observations is an assumption in logistic regression models. I have three questions: 1) How can we check whether the condition of independence holds in a sample? 2) Why does logistic regression require independence of observations in the...
From: Stats Stack Exchange | By: im7 | Sunday, February 19, 2017
Context: $35$% of the students who took the first semester of the Computer Technologist program exonerated the subject MDyL1. Consider a sample of $10$ students from that semester and the random variable $X$: the number of students that have the subject MDyL1...
From: Stats Stack Exchange | By: emi | Tuesday, February 21, 2017
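If the excerpt above is read as a binomial setting (an assumption, since the question is truncated), with each of the $10$ sampled students independently having exonerated the subject with probability $0.35$, the distribution of $X$ could be sketched as follows. The function name `binomial_pmf` is illustrative and not part of the original question.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Sample size and success probability taken from the question;
# independence between students is an assumption.
n, p = 10, 0.35
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# For a binomial variable the mean equals n * p.
mean = sum(k * pk for k, pk in enumerate(pmf))
```

Any event of interest, such as "at most three students", is then a sum of entries of `pmf`.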
I did a logistic regression with bayesglm from package arm. I got different results depending on the order of the variables in the model: set.seed(334) x1 <- rnorm(100) # some continuous variables x2 <- rnorm(100) z <- 1 + 0.6*x1 + 3*x2 # linear...
From: Stats Stack Exchange | By: giordano | Tuesday, February 21, 2017
How can I interpret this when there is a lot of data? I can't see any pattern...
From: Stats Stack Exchange | By: Mr. Liu | Tuesday, February 21, 2017
86, 82, 87, 85, 89, 86, 85, 83, 87, 82, 80, 79, 84. This is a list of scores for a year. Each number is a 3-month rolling average of that month's number and the two previous months' numbers. Is it possible to calculate or estimate the individual inputs for each month?...
From: Stats Stack Exchange | By: Michael Liversedge | Tuesday, February 21, 2017
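One way to see what the rolling averages above do and do not determine: each average satisfies $a_t = (x_t + x_{t-1} + x_{t-2})/3$, so thirteen averages depend on fifteen monthly values, and two of them have to be supplied from elsewhere. The sketch below, with illustrative function names and made-up starting guesses, rebuilds a monthly series from the averages and two assumed initial months; different guesses give different series that reproduce the same averages.

```python
def rolling_mean3(values):
    """3-point trailing rolling average of a list of monthly values."""
    return [sum(values[i - 2:i + 1]) / 3 for i in range(2, len(values))]

def reconstruct(averages, first, second):
    """Rebuild monthly values from 3-month rolling averages.

    `first` and `second` are guesses for the two months that precede the
    first complete window; they cannot be identified from the averages alone.
    """
    values = [first, second]
    for avg in averages:
        # avg = (x[t] + x[t-1] + x[t-2]) / 3  =>  x[t] = 3*avg - x[t-1] - x[t-2]
        values.append(3 * avg - values[-1] - values[-2])
    return values

averages = [86, 82, 87, 85, 89, 86, 85, 83, 87, 82, 80, 79, 84]

# Two hypothetical pairs of starting months (assumptions, not data).
candidate_a = reconstruct(averages, 86, 86)
candidate_b = reconstruct(averages, 80, 90)
```

Both candidates are consistent with the published averages, which is why extra information (for example, one or two known monthly values) is needed for a unique answer.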
I am currently reading up on online convex optimization. Can someone please explain to me what exactly a leader is in the Follow-The-Leader algorithm and its variants? Why is it called Follow-The-Leader? Much appreciated!...
From: Stats Stack Exchange | By: Machine Learning is not God | Tuesday, February 21, 2017
Suppose I have 4 massive data sets representing specific products all belonging to the same "product class". A priori we expect that these will all behave similarly and that a grouping will make sense. Our task is to predict a binary response associated...
From: Stats Stack Exchange | By: Lepidopterist | Tuesday, February 21, 2017
I have three random variables $X_1, X_2, X_3$ that are normally distributed each with different means $\mu_1, \mu_2, \mu_3$ and share the same variance $\sigma^2$. I want to find the covariance of $(X_1-\bar{X}, \bar{X})$, where $\bar{X}$ is the average...
From: Stats Stack Exchange | By: grayQuant | Tuesday, February 21, 2017
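A short worked derivation for the covariance in the question above, under the additional assumption (not stated in the visible excerpt) that $X_1, X_2, X_3$ are mutually independent, with $\bar{X} = (X_1 + X_2 + X_3)/3$:

$$\begin{aligned}
\operatorname{Cov}(X_1 - \bar{X}, \bar{X}) &= \operatorname{Cov}(X_1, \bar{X}) - \operatorname{Var}(\bar{X}), \\
\operatorname{Cov}(X_1, \bar{X}) &= \frac{1}{3}\sum_{j=1}^{3} \operatorname{Cov}(X_1, X_j) = \frac{\sigma^2}{3}, \\
\operatorname{Var}(\bar{X}) &= \frac{1}{9}\sum_{j=1}^{3} \operatorname{Var}(X_j) = \frac{\sigma^2}{3}, \\
\operatorname{Cov}(X_1 - \bar{X}, \bar{X}) &= \frac{\sigma^2}{3} - \frac{\sigma^2}{3} = 0.
\end{aligned}$$

The means $\mu_1, \mu_2, \mu_3$ do not enter, because covariance is unaffected by shifts in location.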
What is the derivative of $F = \|X^T-S^TAX^T\|_F^2$ w.r.t $A$, where $X \in\mathbb R^{d \times N}$, $S \in\mathbb R^{k \times N}$, and $A \in \mathbb{R}^{k \times N}$? I have tried, and it is as follows: $$\frac{\partial{F}}{\partial A} = -(X^T-S^TAX^T)SX$$...
From: Stats Stack Exchange | By: Elyor | Tuesday, February 21, 2017
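A sketch of the differential calculation for the question above, assuming the convention that $\partial F / \partial A$ has the same shape as $A$ ($k \times N$). Write $R = X^T - S^TAX^T \in \mathbb{R}^{N \times d}$, so that $F = \operatorname{tr}(R^T R)$:

$$\begin{aligned}
dR &= -S^T \, dA \, X^T, \\
dF &= 2\operatorname{tr}(R^T \, dR) = -2\operatorname{tr}(R^T S^T \, dA \, X^T) = -2\operatorname{tr}\!\left((S R X)^T dA\right), \\
\frac{\partial F}{\partial A} &= -2\, S R X = -2\, S\,(X^T - S^TAX^T)\,X.
\end{aligned}$$

A dimension check supports this ordering: $S$ is $k \times N$, $R$ is $N \times d$, and $X$ is $d \times N$, so the product $SRX$ is $k \times N$, matching $A$.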
I am running Quasi-Poisson generalized linear models (overdispersed data) on my count data, and plotting data and the fitted model with ggplot. My data with DPUT: structure(list(YR = c(1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970,...
From: Stats Stack Exchange | By: Dag | Tuesday, February 21, 2017
When running the data.frame command in R (shown below - please note that "Macro" is my variable of interest within the model), I get an output for my variable, fit, se, lower, and upper. I am aware of what each output is telling me except fit. >data.frame(effect(c("Macro"),model))...
From: Stats Stack Exchange | By: Megan Novak | Tuesday, February 21, 2017
I need to find the factors that most affect the number of students who failed a class. So, I have a dataset containing the number of students who failed and variables which are specific to the school, class, etc. Essentially, I want to figure out the variables...
From: Stats Stack Exchange | By: user69769 | Tuesday, February 21, 2017
Lee and Lemieux (p. 31, 2009) suggest that the researcher also present graphs when doing regression discontinuity design analysis. They suggest the following procedure: "...for some bandwidth $h$, and for some number of bins $K_0$ and $K_1$ to the left...
From: Stats Stack Exchange | By: bachelor | Tuesday, February 21, 2017
I have been reading up a bit on LSTMs and their use for time series, and it's been interesting but difficult at the same time. One thing I have had difficulty understanding is the approach to adding additional features to what is already a list...
From: Stats Stack Exchange | By: Ravash Jalil | Tuesday, February 21, 2017
If a linear regression model is severely overfitted, the calculated prediction interval can be pretty tight, but the model won't generalize well to unseen data, so the prediction interval can be totally wrong in this case. Is this correct?
From: Stats Stack Exchange | By: hooyeh | Tuesday, February 21, 2017
I am trying to build a very simple NN to approximate a linear function (literally). I took a table data: f(x) = 5 * x Shapes: Now I am building a very simple NN using Keras: from keras.models import Sequential from keras.layers import Dense from keras.layers.core...
From: Stats Stack Exchange | By: FF0_EAX | Tuesday, February 21, 2017
I am currently reading up on hidden Markov models [Source] and trying to understand some of the solutions related to the evaluation problem, which is: given an observation sequence $\boldsymbol{O} = \{o_1, o_2, o_3, \ldots, o_t\}$ and a model $\lambda$...
From: Stats Stack Exchange | By: J.Down | Tuesday, February 21, 2017
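The evaluation problem mentioned above (computing the probability of an observation sequence given a model) is commonly solved with the forward algorithm. The following is a minimal sketch for a discrete HMM; the two-state model and all of its probabilities are hypothetical values chosen only for illustration, and the function name is not from the original source.

```python
def forward_likelihood(observations, initial, transition, emission):
    """P(O | lambda) for a discrete HMM via the forward algorithm.

    initial[i]       : P(state_1 = i)
    transition[i][j] : P(state_{t+1} = j | state_t = i)
    emission[i][o]   : P(observation = o | state = i)
    """
    n_states = len(initial)
    # alpha[i] = P(o_1, ..., o_t, state_t = i | lambda)
    alpha = [initial[i] * emission[i][observations[0]] for i in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            emission[j][obs] * sum(alpha[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    # Summing over the final state gives the likelihood of the whole sequence.
    return sum(alpha)

# Hypothetical two-state, two-symbol model (illustration only).
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.4, 0.6]]
emission = [[0.9, 0.1], [0.2, 0.8]]

likelihood = forward_likelihood([0, 1, 0], initial, transition, emission)
```

The recursion costs on the order of (number of states)$^2$ operations per time step, instead of summing over every possible hidden state path. For long sequences, implementations usually rescale `alpha` or work in log space to avoid underflow, which this sketch omits.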
There is a great deal of information on how unbalanced data sets may impact predictive accuracy in classification problems. Several solutions have been proposed (see here). My questions are: Can a highly skewed target distribution (i.e. when the response...
From: Stats Stack Exchange | By: newbie | Tuesday, February 21, 2017
I need help. I was just given a task (at work) to find out whether a time series follows a negative binomial distribution or not. Unfortunately, I cannot code. I can use Mathematica. Even though I read all the Mathematica guides, I still don't know how to...
From: Stats Stack Exchange | By: Marek N | Tuesday, February 21, 2017
I'm trying to get the same GARCH(1,1) model from both the fGARCH and rugarch packages, but the 'sigma' series I get from the two seem to be very different. The code I use is below. How can I set up rugarch to be exactly like fGARCH? fGARCH: model2=garchFit(formula...
From: Stats Stack Exchange | By: R1989 | Tuesday, February 21, 2017
I have been successfully using Pymc 2.0 for inference in a few fast models. However, I'm trying to set up my first real case inference. I'm using a relatively "heavy" likelihood (it takes ~2 min per evaluation). I noticed that the number of samples specified...
From: Stats Stack Exchange | By: Argantonio65 | Sunday, February 19, 2017
I have a problem that sounds very simple in theory, but I have failed to implement a good solution. Let my data be a continuous variable that follows a normal distribution (m1,v1). I want to get a subsample of this data that follows a different normal distribution...
From: Stats Stack Exchange | By: z_fly_away | Tuesday, February 21, 2017
Beta distribution appears under two parametrizations (or here) $$f(x) \propto x^{\alpha} (1-x)^{\beta} \tag{1}$$ or the one that seems to be used more commonly $$f(x) \propto x^{\alpha-1} (1-x)^{\beta-1} \tag{2}$$ But why exactly is there "$-1$"...
From: Stats Stack Exchange | By: Tim | Monday, February 20, 2017
I'm a complete beginner with regression analysis, so this question will probably seem really silly to you. But I simply need to check multi-collinearity among dependent variables before I start modelling. I know that when variables are multi-collinear,...
From: Stats Stack Exchange | By: Eenoku | Tuesday, February 21, 2017
The model is $$y_{it} = \delta_0d2_t + \delta_1 crm_{it} + (\alpha_i+u_{it})$$ Here the intercept is placed in the error term; therefore, if $\alpha_i$ is correlated with an independent variable, it would cause bias problems. My question is what are the consequences...
From: Stats Stack Exchange | By: krszyoscezio | Tuesday, February 21, 2017
Is it possible to find the Fisher information matrix of the two-parameter (scale and location) exponential distribution? Any hints?
From: Stats Stack Exchange | By: Sibil prasanth | Tuesday, February 21, 2017
I want to find the (sign of) only the first principal component PC1 of a large number n of data points x with a high dimension p. Assume the data has been centered about zero. Data points are stacked into a matrix X with n rows and p columns. We could...
From: Stats Stack Exchange | By: DragonLord | Tuesday, February 21, 2017
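One standard way to obtain only the first principal direction without a full decomposition is power iteration on $X^T X$, applied as $v \leftarrow X^T (X v)$ so that the $p \times p$ matrix is never formed. The excerpt above is truncated, so this may or may not be the approach the asker goes on to describe; the sketch below uses only the Python standard library, and the function name and toy data are hypothetical.

```python
import math
import random

def first_pc(X, n_iter=200, seed=0):
    """Approximate the first principal direction of centered data X
    (n rows, p columns) by power iteration on X^T X."""
    rng = random.Random(seed)
    p = len(X[0])
    v = [rng.gauss(0, 1) for _ in range(p)]  # random starting direction
    for _ in range(n_iter):
        # scores = X v, one projection per data point
        scores = [sum(xi * vi for xi, vi in zip(row, v)) for row in X]
        # w = X^T (X v), computed without building X^T X explicitly
        w = [sum(s * row[j] for s, row in zip(scores, X)) for j in range(p)]
        norm = math.sqrt(sum(wj * wj for wj in w))
        v = [wj / norm for wj in w]
    return v

# Hypothetical centered toy data lying mostly along the direction (1, 1).
X = [[-2.0, -1.9], [-1.0, -1.1], [0.0, 0.1], [1.0, 0.9], [2.0, 2.0]]
v = first_pc(X)
```

Note that the overall sign of a principal component is arbitrary: `v` and its negation describe the same direction, so any sign has to be fixed by an external convention (for example, requiring a chosen coordinate to be positive).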
I want to use meta-analysis on a number of variables, including percentages (like percent nitrogen) and ratios (like ratio of carbon to nitrogen). I'm estimating mean effect sizes using meta-analysis (fixed effects) of log response ratios following Hedges...
From: Stats Stack Exchange | By: user2860703 | Tuesday, February 21, 2017
I have a dataset that contains 99.95% 0's and 0.05% 1's as the target. The dataset contains a million rows. I want to build a binary classification model that predicts almost all the 1's correctly while keeping the false positives to a minimum. I have read...
From: Stats Stack Exchange | By: Aman | Tuesday, February 21, 2017
Has anyone stacked Xgboost models with other models in R? Can someone point me towards example code? Thanks in advance.
From: Stats Stack Exchange | By: adamcamroon | Tuesday, February 21, 2017
Say I have a stream of values $\langle s_1, s_2,\ldots\rangle$ coming in and a function $$E_{s_1:s_n}(t) = E_{s_1:s_{n-1}}(t-1) + \alpha\cdot (s_t-E_{s_1:s_{n-1}}(t-1))$$ that computes their exponential moving average as the values flow in. I would like...
From: Stats Stack Exchange | By: Ron | Monday, February 20, 2017
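The recursive update in the question above can be sketched as a small streaming function. Initializing the average with the first value is one common convention and is an assumption here, since the excerpt does not say how the recursion starts; the name `ema_stream` is illustrative.

```python
def ema_stream(values, alpha):
    """Yield the exponential moving average after each incoming value.

    The first value initializes the average (an assumed convention).
    """
    average = None
    for value in values:
        if average is None:
            average = value
        else:
            # E(t) = E(t-1) + alpha * (s_t - E(t-1))
            average = average + alpha * (value - average)
        yield average

smoothed = list(ema_stream([1.0, 2.0, 3.0], alpha=0.5))
# smoothed == [1.0, 1.5, 2.25]
```

Only the previous average has to be kept in memory, which is what makes the update suitable for a stream.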
This source on feature visualization asserts that the input vector $X$, such that $\|X\|_2 \leq 1$, which maximally activates a sparse autoencoder's $i$th hidden unit is of the form $$X_j = \frac{W^{(1)}_{ij}}{\sqrt{\sum_{j=1} (W^{(1)}_{ij})^2}},$$ where...
From: Stats Stack Exchange | By: AlexMayle | Tuesday, February 21, 2017
Suppose we collect 5 phones from 5 students in a class and redistribute them at random. What is the probability that n students get their own phones back? I believe the answer to this question can be found for n = 1, for example, by fixing one correct...
From: Stats Stack Exchange | By: user11629 | Tuesday, February 21, 2017
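If "n students get their own phones back" is read as exactly $n$ students (an assumption; the excerpt is truncated), the count of favorable arrangements is $\binom{5}{n} D_{5-n}$, where $D_m$ is the number of derangements of $m$ items: choose which $n$ students are matched, then make sure none of the others is. A minimal sketch with illustrative function names:

```python
from fractions import Fraction
from math import comb, factorial

def derangements(m: int) -> int:
    """Number of permutations of m items with no fixed point."""
    if m == 0:
        return 1
    if m == 1:
        return 0
    prev2, prev1 = 1, 0  # D(0), D(1)
    for k in range(2, m + 1):
        # Recurrence: D(k) = (k - 1) * (D(k-1) + D(k-2))
        prev2, prev1 = prev1, (k - 1) * (prev1 + prev2)
    return prev1

def prob_exactly(n: int, total: int = 5) -> Fraction:
    """P(exactly n of `total` students receive their own phone)."""
    return Fraction(comb(total, n) * derangements(total - n), factorial(total))

probabilities = {n: prob_exactly(n) for n in range(6)}
```

The probability for $n = 4$ is zero, since if four students have their own phones the fifth must as well, and the six probabilities sum to one.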
I have a dataset containing approximately 100 features (10 factors and the rest numeric). It is a classification problem with very high class imbalance. What model would you recommend based on your experience and why?
From: Stats Stack Exchange | By: Aman | Tuesday, February 21, 2017
I am trying to estimate an ARMA-GARCH model for my stock returns time series. I have estimated an ARMA model for my series and found that ARCH effects exist, so I added a GARCH(1,1) term. However, I now find that previously significant ARMA coefficients are insignificant....
From: Stats Stack Exchange | By: R.Lee | Tuesday, February 21, 2017
From: Interpreting Residual and Null Deviance in GLM R we see that the residual and null deviances are calculated from the baseline log-likelihood value given by the saturated model. It also notes that the smaller these deviances are, the better the...
From: Stats Stack Exchange | By: Alex | Tuesday, February 21, 2017
I have a large amount of data that was collected by "convenient sampling," (i.e. it is not "nationally representative."). As such, there may be sampling bias in the statistics calculated from these data. Could I subsample it so that the sample distribution...
From: Stats Stack Exchange | By: user310374 | Tuesday, February 21, 2017
Mixture models lend themselves nicely to HMM (hidden Markov model) treatment. Obviously, HMM can be unimodal (nonmixture) when that is a primary result or the superposition of self-similar states. What I am asking for is whether or not there are any...
From: Stats Stack Exchange | By: Carl | Monday, February 20, 2017
My sample size is 550, and I build a regression model with 80 features. In particular, I am using LASSO and ridge regression with 5-fold leave one-out cross-validation. To evaluate the models, I am using correlation (between the predicted and actual...
From: Stats Stack Exchange | By: renakre | Monday, February 20, 2017
How would I fix my plot so that say lower GDP is more black and higher GDP is more red? (I'm in an introduction Data Lit & Vis class, fairly new to R) data <-read.csv("rosling.csv") head(data) tail(data) median(data$pcGDP) median(data$life.expectancy)...
From: Stats Stack Exchange | By: chucK | Monday, February 20, 2017
I have this huge data set with like 2500 variables and like 142 observations. I want to run a correlation between Variable X and the rest of the variables. But for many columns, there are entries missing. I tried to do this in R using "pairwise-complete"...
From: Stats Stack Exchange | By: Stan Shunpike | Monday, February 20, 2017
Ok, so here is a statistical noob question on this topic: When extracting effects and CIs with the effects package, the CIs often overlap although the main groups are significantly different. Recently, however, I got a paper rejected with the motivation...
From: Stats Stack Exchange | By: Martin | Monday, February 20, 2017
What are the mathematical steps to get the loadings and scores matrices of a 3x3 matrix based on PCA, and what is the relationship between the eigenvalues and eigenvectors and the loadings and scores?
From: Stats Stack Exchange | By: t.hicham | Monday, February 20, 2017
Suppose I have a small 32x3 (nxp) data subset, $X_{n,p}$, approximately normally distributed after OGK-based outlier reduction. Given two estimates $\mu_1, \Sigma_1$ and $\mu_2, \Sigma_2$ (i.e., candidate solutions coming from an optimization routine,...
From: Stats Stack Exchange | By: iwein | Monday, February 20, 2017
Stata allows the estimation of hazard models in the presence of competing risks, using the command stcrreg. Although the command does provide options for diagnostics, they are not well-documented. In particular, I am interested in what Stata calls "efficient...
From: Stats Stack Exchange | By: Bram H | Monday, February 20, 2017
I have a problem where I am aware that the data is well modeled by a Zeta distribution such as $P(X=x) = x^{-a}/\zeta(a)$, and would like to learn the Zeta distribution parameter $a$ from the data. More explicitly, I would like to turn the knowledge of...
From: Stats Stack Exchange | By: famargar | Monday, February 20, 2017
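One standard route to learning $a$ from data is maximum likelihood; whether that is what the asker ultimately needs is not clear from the truncated excerpt. Assuming independent observations $x_1, \ldots, x_n$ from the stated distribution with $a > 1$, the log-likelihood and its stationarity condition are:

$$\begin{aligned}
\ell(a) &= -a \sum_{i=1}^{n} \log x_i - n \log \zeta(a), \\
\frac{d\ell}{da} &= -\sum_{i=1}^{n} \log x_i - n\,\frac{\zeta'(a)}{\zeta(a)} = 0
\quad\Longrightarrow\quad
-\frac{\zeta'(a)}{\zeta(a)} = \frac{1}{n}\sum_{i=1}^{n} \log x_i.
\end{aligned}$$

The last equation has no closed-form solution in $a$, so the estimate is typically found with a one-dimensional numerical root finder or optimizer.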
Here's my situation. Response variable: 0/1 (alive/dead) Bernoulli data. Predictor 1: categorical (brand of drug). Predictor 2: continuous (dose of drug). I want to run a logistic regression to test whether there's an effect of Predictor 1 on the response....
From: Stats Stack Exchange | By: RegalPlatypus | Monday, February 20, 2017
