Serendeputy is your personal news assistant.

Your deputy:

- learns what you like and don't like,

- lovingly compiles a list of news and blogs for you.

I'm working on my first independent project for some pattern classification. I'm utilizing some datasets from UCI machine learning, but am not sure how to start with data normalization. The data isn't that large (feature vector around 15-20 dimensions),...

From: Stats Stack Exchange | By: mcjoness | Saturday, April 19, 2014

Assuming a dataset with the following attributes: Date (truncated), f1 ... fn, #impressions, #goals. The problem: I want to grow $n$ trees that would find the optimal selection of features and their ranges in each, and that maximize the goal rate (goals...

From: Stats Stack Exchange | By: Haggai | Sunday, April 20, 2014

I need to correlate employee engagement (gathered data using the 9 item UWES questionnaire) and organizational commitment (gathered data using the 18 item Organizational Commitment Scale). Both of them can be divided into different components; UWES...

From: Stats Stack Exchange | By: naugri | Sunday, April 20, 2014

I am looking to predict groups of items that someone will purchase... i.e., I have multiple, colinear dependent variables. Rather than building 7 or so independent models to predict the probability of someone buying each of the 7 items, and then combining...

From: Stats Stack Exchange | By: blast00 | Sunday, April 20, 2014

I am working on a project, and I am totally new to statistics. I have sales data for the last two years at week level, along with other variables like temperature and holiday (TRUE/FALSE), where holiday is a nominal variable. I have to do forecasting for the...

From: Stats Stack Exchange | By: Arushi | Saturday, April 19, 2014

I am working to analyze poverty rate using census data. I have a huge dataset. I want to extract the likelihood from this dataset in order to create patterns for energy consumption. Let's say this: in a house where we have 3 members with average age...

From: Stats Stack Exchange | By: user3378649 | Sunday, April 20, 2014

I am trying to do experiments on classifying longitudinal systems. We're working on classifying the location where we sell items most. I don't have a lot of experience in statistics and modeling data beyond a high school statistics course so I'm kinda...

From: Stats Stack Exchange | By: user3378649 | Sunday, April 20, 2014

$X$ and $Y$ are uniformly distributed on the unit disk. Thus,
$f_{X,Y}(x,y) = \begin{cases} \frac{1}{\pi}, & \text{if} ~ x^2+y^2 \leq 1,\\
0, &\text{otherwise.}\end{cases}$
If $Z=X+Y$, find the pdf of $Z$.

From: Stats Stack Exchange | By: Someone | Sunday, April 20, 2014

I have a question about the prediction of volatility and returns of a time series. Basically it is a question about prediction in the fGarch package. The following code is from the book Analysis of Financial Time Series and it is an example of AR/GARCH...

From: Stats Stack Exchange | By: user8 | Sunday, April 20, 2014

I was doing some self study and came across the following formulae for estimating standard errors: Formulae 1: Formulae 2: I understand that these two can all be used when the Population Standard Deviation is unknown. But I don't really understand why...

From: Stats Stack Exchange | By: user1275515 | Sunday, April 20, 2014

I am looking for an introductory to intermediate level book on Generalized Linear Models. Ideally, in addition to the theory behind the models, I would want it to include applications and examples in R or another programming language - I hear SAS is...

From: Stats Stack Exchange | By: JohnK | Saturday, April 19, 2014

Can you please provide
One advantage of "k-Means" compared to "Hierarchical Clustering"
One advantage of "Hierarchical Clustering" compared to "k-Means"
Thanks in advance !!

From: Stats Stack Exchange | By: YevgenyM | Sunday, April 20, 2014

I am here to seek opinions on how I should represent the data that I have collected. I am to create a presentation focusing on the environment. I was told that a simple bar chart and a line graph are bad visualizations. I have picked the following data set...

From: Stats Stack Exchange | By: user2691544 | Sunday, April 20, 2014

I was wondering whether you could help me with this question. I am not sure whether I am doing it correctly, so any guidance from anyone would be most appreciated. I will post the full question so please do bear with me. Let X be a random variable...

From: Stats Stack Exchange | By: Ingrid | Sunday, April 20, 2014

I am working on a project where I have to do multi-label text classification. I want to understand whether my approach is correct or I am missing something. I am using R to do it. Clean the text. Create a corpus. While creating the corpus I am removing...

From: Stats Stack Exchange | By: tanay | Sunday, April 20, 2014

I have $n$ dice with $m$ sides. The $i^{th}$ die will show value $0 \leq x_i \leq m-1$ with probability $0 \leq D_i(x_i) \leq 1$. What is the probability that the sum of the dice equals $\alpha$? Is there some approximation for $P(\alpha)$...

From: Stats Stack Exchange | By: IndustryMinion | Sunday, April 20, 2014

I am using LibSVM (3.18) as an implementation of SVM. But every time when I'm predicting the result, it's giving zero. I am following these instructions: I have CSV file (+50K lines), Most of data in column (target) is zeros, the other values are between...

From: Stats Stack Exchange | By: user3378649 | Sunday, April 20, 2014

I am looking for a method or package in R that can remove heteroscedasticity from time series. Specifically, I have a number of time series $$Z = (Z_1, \ldots, Z_p)$$ where $Z_j = \{(Z_j)_t\}_{t=1}^{T}$ to which I want to fit a VAR model. Each time series...

From: Stats Stack Exchange | By: Stijn | Sunday, April 20, 2014

I have collected data from 88 human subjects. There are two subject groups, A (test) and B (control). The number of subjects in each group is 44. The subjects are paired between groups. There are two measurements from each subject, one before, and one after...

From: Stats Stack Exchange | By: Guest | Sunday, April 20, 2014

I have a predictor with responses from 140 people in group A and 60 in group B. My mediator only uses responses from group A, and my outcome variable uses responses from Groups A, B, C $(n=31)$. What type of analysis do I need to run? What software would...

From: Stats Stack Exchange | By: Uriah07 | Sunday, April 20, 2014

I collected some data on a species of goose called Brent Goose over the winter. A csv file of the data can be downloaded from Dropbox or imported straight into R with this code: library(repmis) goose_behaviour <- repmis::source_DropboxData("goose_behaviour.csv",...

From: Stats Stack Exchange | By: luciano | Sunday, April 20, 2014

According to my understanding, when we have an unknown population mean and variance, we have to estimate the population variance through the sample variance and use the t distribution to estimate the potential range of the population mean using the estimated population variance...

From: Stats Stack Exchange | By: user3420399 | Sunday, April 20, 2014

Following Hofert et al.'s paper "Likelihood inference for Archimedean copulas in high dimensions under known margins," (http://dl.acm.org/citation.cfm?id=2263953) I wrote a script in Matlab to produce estimates of Archimedean copulas in high dimensions....

From: Stats Stack Exchange | By: Sonntag | Sunday, April 20, 2014

I have the following hourly time series data and would like to fit a best fit line to it: There seems to be a periodicity on a daily basis and a weekly basis. By this, I mean there are patterns that repeat every day (e.g. peaks during 7PM) and patterns...

From: Stats Stack Exchange | By: mchangun | Sunday, April 20, 2014

What is the difference between compositional data model using additive log-ratio (alr) transformation and aggregated multinomial logit model?

From: Stats Stack Exchange | By: Surveyor | Sunday, April 20, 2014

I have a question about Arellano-Bond model in Stata (xtabond/xtabond2). The slopes I get, are they for levels or differences of values? My model to be estimated has a form of (D is first difference): DY=a+DX1+DX2+.... So should I use already differentiated...

From: Stats Stack Exchange | By: Risto | Sunday, April 20, 2014

I've been reading the Wikipedia page for Levene's test, and it cites the degrees of freedom as (k - 1, N - k), where k is the number of different groups to which the sampled cases belong, and N is the total number of cases in all groups. However, it...

From: Stats Stack Exchange | By: Sasha | Sunday, April 20, 2014

What is the difference between finite and infinite variance? My stats knowledge is rather basic; Wikipedia and Google weren't much help here.

From: Stats Stack Exchange | By: AfterWorkGuinness | Saturday, April 19, 2014

I'm working on a review paper and need to collect the means and standard deviations of a given measure (such as a measure of depression) from papers of interest. However, some authors report means and standard deviations for each item on the measure,...

From: Stats Stack Exchange | By: user30295 | Saturday, April 19, 2014

I read an article that says the dependent variables in a regression model must be normally distributed. The way I understand it, the observations for the regression model must then be normally distributed. Or in other words, if I choose sample...

From: Stats Stack Exchange | By: Jason Samuels | Saturday, April 19, 2014

From this video by Andrew Ng around 5:00 How are $\delta_3$ and $\delta_2$ derived? In fact, what does $\delta_3$ even mean? $\delta_4$ is got by comparing to y, no such comparison is possible for the output of a hidden layer, right?...

From: Stats Stack Exchange | By: qed | Saturday, April 19, 2014

When comparing feature-based classification techniques what characteristics about the different processes should be considered? I'm comparing different classification techniques to try to figure out what should be considered when selecting a classification...

From: Stats Stack Exchange | By: HardcoreBro | Saturday, April 19, 2014

Let $\mathcal{H}\colon\mathbf{w}\cdot\mathbf{x}+b=0$ be a separating hyperplane, which some binary linear classifier results in. Let $\mathbf{x}_t$ be an unseen, new sample that appears and needs to be classified. We can predict the truth label of $\mathbf{x}_t$...

From: Stats Stack Exchange | By: nullgeppetto | Saturday, April 19, 2014

I hope I am asking this in a way that makes sense. I am comparing 8 means and want to set up planned comparisons, rather than having my Bonferroni adjustment become overly conservative in a post-hoc test. For my groups I need to make a total of 16 comparisons,...

From: Stats Stack Exchange | By: Phillip | Saturday, April 19, 2014

I estimated the mean and variance of two latent variables through two groups of data. I can't use the data to do hypothesis testing, because I am interested in the latent variable. Is there a way to test whether the two latent variables are significantly...

From: Stats Stack Exchange | By: user258682 | Saturday, April 19, 2014

For the following problem: $\text{min:}\ f(x)\\ s.t. \ g(x)\leq t$ Is the above problem equivalent to the following problem? $\text{min:}\ f(x) + \lambda g(x) \\ s.t. \ \lambda\geq0$ where $t$ and $\lambda$ are variables. It seems equivalent, because...

From: Stats Stack Exchange | By: user137273 | Saturday, April 19, 2014

Assume a model like this, basically explaining stock market returns with a bunch of stuff: stockReturn(t) ~ bondReturn(t) + moneyMarketReturn(t) + inflation(t) + somethingElse(t) Does using inflation as an independent variable bring any significant problems?...

From: Stats Stack Exchange | By: Roope | Saturday, April 19, 2014

In the questionnaire I asked respondents from two countries how many job offers they received from 5 sources in the last 6 months. There are 5 questions - one for each source. It is an open question, without a scale as the two countries strongly differ...

From: Stats Stack Exchange | By: Anna | Saturday, April 19, 2014

I ran the same SEM model in sem and lavaan. I got the same parameters and - generally - very close test values, with the exception of AIC and BIC which were immensely different between the two packages. The following is the resulting AIC and BIC from...

From: Stats Stack Exchange | By: Deuterium | Saturday, April 19, 2014

Suppose I have a big online company, and many of my customers churned (i.e. they were paying, and then stopped). My goal is to understand why each of them churned. First I identify the complete set of reasons for churning, $H_1,\ldots,H_n$. E.g. "the...

From: Stats Stack Exchange | By: Diego de Estrada | Saturday, April 19, 2014

I have a variable whose value I can only measure at the end of life of a product (which is not fixed). The variable's value, continuous and between 0 and 100, may be related to its age at that time. My data consists of the various ages of a set of products...

From: Stats Stack Exchange | By: Sjoerd C. de Vries | Saturday, April 19, 2014

I am talking about a situation in which I have several continuous predictor variables predicting a continuous outcome. One of the predictors has a very non-normal distribution and has some wild outliers. I intend to generalize the regression model to...

From: Stats Stack Exchange | By: Sasha | Saturday, April 19, 2014

Good evening all, I am doing a self-study exercise, but have been puzzled by a part of the question on finding percentage points of a normal distribution. I fully understand the first part of the question and was able to find the answer, which corresponds...

From: Stats Stack Exchange | By: user1275515 | Saturday, April 19, 2014

How do you interpret the results of a multivariate probit regression? Is it interpreted the same way as OLS?

From: Stats Stack Exchange | By: user44067 | Saturday, April 19, 2014

I have a set of data with features of movies and features of users and a third matrix with ratings of user for each movie. I have to build a recommendation system for new users. Can you help me with the problem? I am not sure how to go about it. What...

From: Stats Stack Exchange | By: Sejal Shinde | Saturday, April 19, 2014

I am looking for a Python library or module function that allows me to estimate probability densities p(x) using the Parzen-window approach with a Gaussian kernel (with variable sigma, or 'window width'). I managed to implement the Parzen-technique using...

From: Stats Stack Exchange | By: Sebastian Raschka | Saturday, April 19, 2014

I am a PhD student in New Zealand. I need to determine the impact of lameness on the milk yield of cows. I measured milk yield daily and also recorded the cows that were observed lame on any one day. I recorded data daily for 325 consecutive days. it...

From: Stats Stack Exchange | By: carolina diaz | Saturday, April 19, 2014

Please forgive this silly answer, I'm fairly new to statistics. Consider this R code: a = c(1,2,3,4,3,2,3,4,5,5,6,5,4,3,4,5,6,7,8,7,6,6,5,6,7,10,9) b = c(10,9,7,6,5,6,7,8,4,6,6,5,4,5,6,5,4,5,6,7,5,4,4,5,4,3,2) mean((a - mean(a))*(b-mean(b))) [1] -2.42524...

From: Stats Stack Exchange | By: kamula | Saturday, April 19, 2014

Negative Binomial distribution can be parameterized using mean, $\mu$, and overdispersion $\psi$, so that the variance of NB is $\mu + \frac{\mu^2}{\psi}$. We know there is no analytical solution for estimating $\psi$. I understand the variance of NB...

From: Stats Stack Exchange | By: user258682 | Friday, April 18, 2014

My problem is that I want to implement a Parzen-window estimation for a Gaussian kernel, but I have a problem understanding how I can check whether a point (2D or 3D) lies within a Gaussian sphere. Given a set of sample points, I want to check how many...

From: Stats Stack Exchange | By: Sebastian Raschka | Friday, April 18, 2014
