I'm a novice in machine learning. I'm trying to make predictions with classification methods. My class has 3 possible states, so random guessing gives about 33% accuracy. I can't get beyond 45% accuracy. To improve my accuracy, I changed my features....

From: Stats Stack Exchange | By: Yoann boyere | Sunday, March 26, 2017

I've read many threads on this website trying to understand why we need to break the data on hand into 3 parts: the training, validation and test data sets. I still think it is enough to break the data set into 2 pieces, i.e., the training...

From: Stats Stack Exchange | By: KevinKim | Sunday, March 26, 2017

For every 100 women who use a certain type of contraception method, the article plots the number of unplanned pregnancies over time. https://www.nytimes.com/interactive/2014/09/14/sunday-review/unplanned-pregnancies.html?_r=0 In particular at the end...

From: Stats Stack Exchange | By: user103341 | Sunday, March 26, 2017

I'm provided with the following dataset: Dataset. I'm meant to use sklearn to create a Support Vector Machine that can predict it. I load A and B from my dataset into a 2 dimensional array called input_data and load the label from my dataset into an...

From: Stats Stack Exchange | By: patrickdamery | Wednesday, March 29, 2017

I have devised a new clustering algorithm that is domain agnostic and has several assumptions. I can't seem to find real-world data to test it, so I have generated some synthetic data. The idea: generate pre-cluster data; "forget" the clusters and unify...

From: Stats Stack Exchange | By: Jack Stevens | Tuesday, March 28, 2017

In the sarima function in the astsa package in R, we can add external regressors to a SARIMA model, so I assume that we obtain a SARIMAX model? If we add regressors to a SARIMA(p, 0, 0) x (0, 0, 0)o model, is this equivalent to adjusting an ARX model...

From: Stats Stack Exchange | By: Xavier | Tuesday, March 28, 2017

Edit: I apologize if the question is considered too broad. In fact, it concerns a very specific task in bioinformatic analysis of high-throughput data sets, and in my opinion the problem presented here, albeit not formulated formally, is very specific....

From: Stats Stack Exchange | By: January | Monday, March 27, 2017

I'm trying model stacking in a Kaggle competition. However, what the competition is trying to do is irrelevant. I think my approach to model stacking is not correct. I have 4 different models: xgboost model with dense features (numbers, that can...

From: Stats Stack Exchange | By: user1157751 | Wednesday, March 29, 2017

I need to analyse participants' reaction time data from a 3x2x2 (Face-ABC, Visual field-1,2,Prime-?,!) repeated measures design. Each trial manipulated all IVs and multiple trials were completed per combination. Insofar as data transformation, my approach...

From: Stats Stack Exchange | By: cel12345 | Tuesday, March 28, 2017

The regression between y and x gives the equation y-hat = -1.2 + 3.4x. The R-squared value for this regression is 0.64. What is the correlation value? Input your answer in decimal format, rounded to 2 decimal places.

From: Stats Stack Exchange | By: courtney.b | Wednesday, March 29, 2017
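
For the regression question above, the arithmetic can be sketched as follows. This assumes a simple least-squares regression with one predictor, in which case R-squared equals the squared Pearson correlation and the correlation carries the sign of the slope. The function name is hypothetical and only for illustration.

```python
import math


def correlation_from_r_squared(r_squared: float, slope: float) -> float:
    """Recover Pearson's r from R-squared in a one-predictor regression.

    Assumes simple linear regression, where R^2 = r^2 and r has the
    same sign as the fitted slope.
    """
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must lie in [0, 1]")
    # copysign attaches the sign of the slope to the magnitude sqrt(R^2).
    return math.copysign(math.sqrt(r_squared), slope)


# Values taken from the question: y-hat = -1.2 + 3.4x, R-squared = 0.64.
r = correlation_from_r_squared(0.64, 3.4)
print(round(r, 2))
```

With a negative slope the same R-squared would give a negative correlation of the same magnitude, which is why the sign of the slope matters here.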

I have read the article Feature Selection Based on Mutual Information: Criteria of Max-Dependency, Max-Relevance, and Min-Redundancy. In this article in section 2.3, the theorem about equivalence of first-order incremental search for mRMR and Max-Dependency...

From: Stats Stack Exchange | By: Peter Bugata | Tuesday, March 28, 2017

Given that A, B, C, D are the input nodes and Quality is the output node, the value of every input node can vary from 0 to 1....

From: Stats Stack Exchange | By: user3447215 | Wednesday, March 29, 2017

Using logistic regression (lrm) from rms, is there a way to use the Predict command to compute (and plot) predicted probabilities, not log odds?

From: Stats Stack Exchange | By: Paul gronke | Wednesday, March 29, 2017

From what I know, Recurrent NNs perform very well in case of sequential data. However, I have also read in many places that they can be used for non-sequential data as well. For instance in the article 'The unreasonable effectiveness of Recurrent Neural...

From: Stats Stack Exchange | By: darthy | Wednesday, March 29, 2017

I have to classify the executable files as malicious and non-malicious files. I have created my own corpus Train. I have explained the errors below. The input file format is also given below. How can I get the presence of the features with their names...

From: Stats Stack Exchange | By: banu | Wednesday, March 29, 2017

I started with 9 independent variables in my linear regression. However, I found that the overall F-test does not have a significant p-value and some variables are highly correlated. Thus I discarded five independent variables and got both significant...

From: Stats Stack Exchange | By: Eric | Tuesday, March 28, 2017

I ran an experiment with 4 factors A, B, C, and D. Factor C is nested within B. The results of the ANOVA show: C had a significant effect and AxB had a significant interaction; D had no significant effect, but in this case, D is species, and I want to...

From: Stats Stack Exchange | By: Nathan Haag | Tuesday, March 28, 2017

I have a network that is an "ensemble" of text data and linear data that feed into a concat layer and then into another feed-forward network. I understand that the gradient tells you how much an input/node affects the output of the network. Would it be possible/usefult...

From: Stats Stack Exchange | By: Camron_Godbout | Tuesday, March 28, 2017

In the book Time Series Analysis by R, the author mentions the use of moving average to smooth out the white noise.
Can moving averages be used to remove white noise or are there better methods?

From: Stats Stack Exchange | By: jeffy abraham | Tuesday, March 28, 2017
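
As a rough illustration of the moving-average question above, here is a minimal sketch of a simple (unweighted, trailing) moving average applied to simulated Gaussian white noise. The function name, window length and seed are assumptions chosen for the example, not anything from the book. For independent noise with variance σ², the mean of k values has variance σ²/k, so the smoothed series should fluctuate less than the raw one.

```python
import random
import statistics


def moving_average(values, window):
    """Simple moving average over `window` consecutive values.

    Returns len(values) - window + 1 averages.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the newest value, drop the oldest.
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


# Simulated white noise (assumed Gaussian, mean 0, standard deviation 1).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
smoothed = moving_average(noise, window=10)

raw_var = statistics.pvariance(noise)
smooth_var = statistics.pvariance(smoothed)
print(raw_var, smooth_var)
```

The sketch only shows that averaging damps the noise. It does not address whether other filters would be preferable, and a longer window also blurs any real signal in the series.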

I am looking at the following pdf, and on page 4 it mentions that if $x,y$ are jointly distributed variables which bear the linear relationship $$E(y|x) = \alpha + B^T x$$ then $$Var(y|x) = Var(y) - Cov(y,x)[Var(x)]^{-1} Cov(x,y)$$ Note that $E(y|x)...

From: Stats Stack Exchange | By: user1769197 | Tuesday, March 28, 2017

My first question was: how does the sample size affect t-test results? And I found the answer in this post. Now I understand the "unbalanced" situation does not necessarily affect the results of a t-test. I also tried power200200 and power20050 in the...

From: Stats Stack Exchange | By: Yan | Tuesday, March 28, 2017

Consider the random variables $X$ and $Y$ defined on the same probability space $(\Omega, \mathcal{F}, \mathbb{P})$ taking values respectively in $\mathbb{R}^d$ and $\mathbb{R}^p$. Let $f:\mathbb{R}^d\rightarrow \mathbb{R}^m$. Let $g:\mathbb{R}^m\rightarrow...

From: Stats Stack Exchange | By: user3285148 | Tuesday, March 28, 2017

I have an estimator $\theta$ for the mean $\mu$. I understand consistency to mean that $\theta$ converges in probability to $\mu$ as $n$ goes to infinity. Now, I encountered another concept, consistency in mean square. $\theta$ is consistent in mean square...

From: Stats Stack Exchange | By: cecefuss | Tuesday, March 28, 2017

I have the following model $y_i=\beta_1+\beta_2x_i+\epsilon_i$ with $E(\epsilon^2)=\sigma^2\exp(x_i)$ And I have to use the proper transformation to obtain a model where the variance of the error is $\sigma^2$. My guess: I do not have any guesses because...

From: Stats Stack Exchange | By: plr | Tuesday, March 28, 2017

I am working with linear Gaussian Bayesian networks, and trying to recover the joint multivariate distribution from the conditionals. This is described in Probabilistic graphical models by Koller (pg 251). (A summary can be found at the link: pdf download!)...

From: Stats Stack Exchange | By: user2957945 | Tuesday, March 28, 2017

I have to extract data from our databases, on which I then perform various aggregation and recoding steps before handing it to analysts. I would like to document the definition of the variables carefully so that everything is transparent. For example, I want the...

From: Stats Stack Exchange | By: Heisenberg | Tuesday, March 28, 2017

Old Scheme: 57 103 59 75 84 73 35 110 44 82 67 64 78 53 41 39 80 87 73 65 28 62 49 84 63 77 67 101 91 50
New Scheme: 62 122 54 82 84 86 32 104 38 107 84 85 99 39 34 58 73 53 66 78 41 71 38 95 81 58 75 94 100 68
The sales output (in £000) before and after...

From: Stats Stack Exchange | By: Ruma Sinha | Tuesday, March 28, 2017

I have been working on a project in which I have to collect the following data over a timespan: temperature of a room, humidity inside the room, amount of CO2 present in the room, and number of persons in the room. I have already conducted the experiment and...

From: Stats Stack Exchange | By: somdeep acharyya | Tuesday, March 28, 2017

I am stuck on the following homework assignment: $$X'X=\begin{bmatrix} 10 & 1.2980 & -2.4641 & 0.7716 \\ 1.2980 & 4.8676 & -3.0048 & -1.6154 \\ -2.4641 & -3.0048 & 5.3561 & -0.4576 \\ 0.7716 & -1.6154 & -0.4576...

From: Stats Stack Exchange | By: user429134 | Tuesday, March 28, 2017

This is a followup to the answers here and here. I have not seen this term in any textbooks I have or many online resources. It is not, for example, present on the SVM wikipedia page. What is a hypothesis class in the context of SVM? How do the support...

From: Stats Stack Exchange | By: kingledion | Tuesday, March 28, 2017

I would like to know how to handle missing data in predictive analysis. In my case, it has been decided not to omit missing information; however, certain predictive models such as logistic regression and random forest cannot handle missing...

From: Stats Stack Exchange | By: user95902 | Tuesday, March 28, 2017

I have some data representing time series of house costs in specific areas. Some of the values along the time series (30 points = 30 months) are missing or are totally wrong (huge spikes). What I am doing right now is to calculate the average and...

From: Stats Stack Exchange | By: Randomize | Tuesday, March 28, 2017

I am building a recommender system using Collaborative Filtering. I have implemented the Alternating Least Squares method following this tutorial. Now I want my algorithm to adapt to new ratings for movies that were previously not rated. Should the algorithm...

From: Stats Stack Exchange | By: Jatin Bhola | Tuesday, March 28, 2017

Let $M$ be an $n \times k$ matrix which is the outcome of a subjective test, where $n$ is the number of samples and $k$ is the number of raters. Values in $M$ range from 0 to 1. Since the number of samples is high and the evaluation procedure is long, each rater...

From: Stats Stack Exchange | By: Francesco Setragno | Monday, March 27, 2017

I have an observed data set, denoted data set A. I simulated a multivariate normal data set, denoted data set B. When I plot the ACF and PACF of both data sets I get very similar results. I applied all joint tests of multivariate normal...

From: Stats Stack Exchange | By: rsc05 | Tuesday, March 28, 2017

I am trying to forecast time-series in a very "applied" sense. Ideally, what I would be looking for was a time-series model (à la ARIMA models), which could capture the dynamics of the growth data of an index. However, I have strong a priori ideas about...

From: Stats Stack Exchange | By: pApaAPPApapapa | Tuesday, March 28, 2017

Here is the statement that I have read: Since we are selecting the furthest outlier, it is not legitimate to use a simple t-test (for studentized residuals) for detecting outliers. To remedy this we can make a Bonferroni adjustment to the p-value. I have...

From: Stats Stack Exchange | By: Daniel Yefimov | Tuesday, March 28, 2017

I wish to forecast Y for year 2018 but I only have two data points of Y in years 2006 and 2012. I already did multiple linear regression (since I have a lot of predictors) but multiple linear regression does not consider the time aspect so my predictions...

From: Stats Stack Exchange | By: Katherine | Tuesday, March 28, 2017

I'm new at learning random variables and stuck in this example. Can anyone help me solve this?
"The RV x is N(5,2) and y=2x+4. Find mean, standard deviation and density function of y."

From: Stats Stack Exchange | By: Kubilay Can DEMİR | Tuesday, March 28, 2017
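
A minimal sketch for the example above, under stated assumptions. A linear function y = a·x + b of a normal variable is again normal, with mean a·μ + b and standard deviation |a|·σ. The notation N(5,2) is ambiguous: the second parameter may be the standard deviation or the variance, so both readings are computed. All function names are hypothetical.

```python
import math


def linear_transform_normal(mean_x, sd_x, a, b):
    """Mean and standard deviation of y = a*x + b when x is normal."""
    return a * mean_x + b, abs(a) * sd_x


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Reading 1 (assumption): the "2" in N(5,2) is the standard deviation.
mean_y, sd_y = linear_transform_normal(5.0, 2.0, a=2.0, b=4.0)

# Reading 2 (assumption): the "2" in N(5,2) is the variance.
mean_y_alt, sd_y_alt = linear_transform_normal(5.0, math.sqrt(2.0), a=2.0, b=4.0)

print(mean_y, sd_y)          # mean 14, sd 4 under reading 1
print(mean_y_alt, sd_y_alt)  # mean 14, sd 2*sqrt(2) under reading 2
print(normal_pdf(mean_y, mean_y, sd_y))  # density of y at its mean, reading 1
```

Under either reading the mean of y is 14 and the density of y is the normal density with the corresponding standard deviation.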

I just estimated a Vector Autoregressive Model with 6 lags and 10 variables in R. My goal is to simulate the given original time series (on which the model parameters were estimated) to see how the model fits. As the simulation in R doesn't work (but...

From: Stats Stack Exchange | By: Blair92 | Tuesday, March 28, 2017

Let's say we have a random variable $Y$ defined as the weighted sum of $N$ Bernoulli variables $X_i$, each with a different success probability $P_i$ and a different weight $W_i$. The weights are positive and between 1 and 1,000. Formally, $Y = \sum X_i W_i$, where...

From: Stats Stack Exchange | By: Leon P | Tuesday, March 28, 2017

I would greatly appreciate if you could let me know how to do discrete time survival analysis with time varying covariates. Some part of my data set is as follows (d1-d12: are dummy variables for each time period): ID TIME EVENT x1 x2 x3 x4 x5 1 1 0...

From: Stats Stack Exchange | By: ebrahimi | Tuesday, March 28, 2017

When performing chi-squared independence tests, why do 2x2 tests always have every residual value (o-e) equal?
Why is this not true for tests with unequal amounts of rows and columns?

From: Stats Stack Exchange | By: Eric | Tuesday, March 28, 2017
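
The observation in the chi-squared question above can be checked numerically. The sketch below computes expected counts from the row and column totals and returns the residuals o - e; the tables are made-up illustrative data and the function name is hypothetical. Because expected counts share the observed margins, the residuals in every row and every column sum to zero. In a 2x2 table that leaves one free value, so all four residuals share the same magnitude; larger tables have more free values.

```python
def residuals(table):
    """Observed minus expected counts under independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [
        [
            # Expected count = row total * column total / grand total.
            table[i][j] - row_totals[i] * col_totals[j] / n
            for j in range(len(col_totals))
        ]
        for i in range(len(row_totals))
    ]


# Made-up 2x2 table: every residual has the same absolute value.
res_2x2 = residuals([[30, 10], [20, 40]])
print(res_2x2)

# Made-up 2x3 table: the residuals no longer share one magnitude.
res_2x3 = residuals([[10, 20, 30], [20, 20, 20]])
print(res_2x3)
```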

I am trying to implement feature hashing in Python. I plan to use the following command: preproc = Pipeline([('fh', FeatureHasher(n_features=2**27, input_type='string', non_negative=False))]). I have a dataframe that has int64, category, object data types....

From: Stats Stack Exchange | By: Aman | Tuesday, March 28, 2017

My question is are all ARIMA processes also unit root processes? My guess is yes because $\{X_t\}$ is ARIMA(p, d, q) if $(1-B)^dX_t = a(B)\epsilon_t$ is stationary ARMA(p, q). The characteristic function for $(1-B)^dX_t = a(B)\epsilon_t$ is $(1-B)^d$,...

From: Stats Stack Exchange | By: Student | Tuesday, March 28, 2017

I need help determining the cut-off value for computing the sensitivity and specificity of a biomarker. I have a continuous biomarker and an outcome with three diseased groups (e.g., stage I, II and III). I know many methods such as: mean of the biomarker, bimodal...

From: Stats Stack Exchange | By: Mohamed Gomaa | Monday, March 27, 2017

I have some data and a model, e.g., $H_0: \xi \sim \mathcal N(\mu_1 > \mu_0, \sigma_0)$, where $\mu_0, \sigma_0$ are fixed; in other words, this is similar to the situation with "unequal means, alternative=greater". How can I calculate the likelihood...

From: Stats Stack Exchange | By: German Demidov | Monday, March 27, 2017

I'm working with a negative binomial regression in Stan. I would like to make predictions on a test set, but looking at the reference I can't find a negative_binomial random number generator. Is there any way to do so without saving mean and overdispersion...

From: Stats Stack Exchange | By: Tommaso Guerrini | Monday, March 27, 2017

I'm trying to understand how the loss_metric class in dlib calculates the gradient. Here is the code (full version): // It should be noted that the derivative of length(x-y) with respect // to the x vector is the unit vector (x-y)/length(x-y). If you stare...

From: Stats Stack Exchange | By: don-prog | Monday, March 27, 2017

I am learning how to use libsvm through sklearn.svm in Python. I read here about what happens and why when you change the C value as part of your model. My intuition from what I've learned would be that lower C values would use fewer support vectors...

From: Stats Stack Exchange | By: kingledion | Monday, March 27, 2017

