statistics
I am struggling to find a suitable method to analyse my data. This is a retrospective, exploratory study; the data were collected in the past. I generated some dummy data for 100 subjects (in ...

Angela Carollo from the Laboratory of Fertility and Well-Being at the Max Planck Institute for Demographic Research (MPIDR) successfully defended her doctoral thesis, “Statistical Analysis of Time-to-Event Data with Multiple Time Scales”, at Leiden University. In her thesis she introduces a statistical model that addresses an important gap in the literature on statistical models for time-to-even…
I am analyzing a Type I right-censored reliability dataset consisting of 36 observations from a fatigue-life experiment. The response variable is lifetime (number of cycles to failure). According to ...


I recently calibrated a recovery-rate model that had only two weak features. Its point accuracy was almost nil, with R² basically zero. I expected its uncertainty estimates to be junk too. They weren't: the 90% conformal prediction intervals covered ~89% of held-out outcomes. Valid, just wide. That surprised me enough to nail it down, because it contradicts a belief a lot of us carry around: "my…
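To make the claim above concrete, here is a minimal sketch of split conformal prediction on synthetic data. It is not the author's recovery-rate model: the data generator, the least-squares fit, the split sizes, and the helper name `conformal_halfwidth` are all assumptions chosen for illustration. It shows that interval coverage can sit near the nominal 90% even when the point predictor explains almost none of the variance.

```python
import math
import numpy as np

def conformal_halfwidth(abs_residuals, alpha=0.1):
    """Half-width of a split-conformal interval from calibration residuals."""
    scores = np.sort(np.asarray(abs_residuals, dtype=float))
    n = scores.size
    k = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    return float("inf") if k > n else float(scores[k - 1])

# Synthetic stand-in (assumption): two features, almost no signal.
rng = np.random.default_rng(0)
n = 3000
features = rng.normal(size=(n, 2))
outcome = 0.02 * features[:, 0] + rng.normal(size=n)

design = np.column_stack([np.ones(n), features])
train = np.arange(0, 1000)
cal = np.arange(1000, 2000)
test = np.arange(2000, 3000)

# Fit a plain least-squares model on the training split only.
coef, *_ = np.linalg.lstsq(design[train], outcome[train], rcond=None)
pred = design @ coef

# Calibrate the interval width on held-out residuals, then check coverage.
q = conformal_halfwidth(np.abs(outcome[cal] - pred[cal]), alpha=0.1)
coverage = float(np.mean(np.abs(outcome[test] - pred[test]) <= q))

# Out-of-sample R^2 of the point predictions.
ss_res = np.sum((outcome[test] - pred[test]) ** 2)
ss_tot = np.sum((outcome[test] - outcome[test].mean()) ** 2)
r2 = float(1 - ss_res / ss_tot)

print(f"R^2 = {r2:.3f}, coverage = {coverage:.3f}, half-width = {q:.2f}")
```

The interval here is `pred ± q` with one constant width for every point, which is why a weak model tends to give valid but wide intervals: the width absorbs all the unexplained noise.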

Abstract: This book presents an intuitive, conceptually-driven introduction to statistical analysis, purposefully designed to bypass intimidating, redundant mathematical formulas (pp. 6, 9). Developed from the author’s extensive experience teaching senior-level experimental neuroscience, the text strips away technical ornamentation to focus strictly on the core logic and historical evolution of st…
I recently wrote some code to generate random matrices $A\in U(n)$ with the property that the entries in $A$, the entries in the normalised eigenvectors of $A$, and the eigenvalues, all lie in $\...

Is there an overview paper (or book, or chapter) on regularization "in general"? By "in general" I mean a treatment that goes beyond the Lasso, Ridge Regression and the Elastic Net ...


A single model hands you a single answer and no sense of how much it hinges on the dozens of choices buried inside it. The post "I Built 11 Models to Predict the 2026 World Cup. They Crown Four Different Champions." appeared first on Towards Data Science.
Let $X$ be a random variable in $\mathbb{R}^d$ with law $\mu$. We denote by $\mathcal{P}(\mathbb{R}^d)$ the set of all Borel probability measures on $\mathbb{R}^d$. Assume that $\mu \in ...

We propose a family of weighted statistics based on the CUSUM process of the WLS residuals for the online detection of changepoints in a Random Coefficient Autoregressive model, using both the standard CUSUM and the Page-CUSUM process. We derive the asymptotics under the null of no changepoint for all possible weighting schemes, including the case of the standardized CUSUM, for which we derive a D…
This study proposes two novel time-varying model-averaging methods for time-varying parameter regression models. When the number of predictors is small, we propose a novel time-varying complete subset-averaging (TVCSA) procedure, where the optimal time-varying subset size is obtained by minimizing the local leave-$h$-out cross-validation criterion. The TVCSA method is asymptotically optimal for a…
We study the uniform convergence rates of nonparametric estimators for a probability density function and its derivatives when the density has a known pole. Such situations arise in some structural microeconometric models, for example, in auction, labor, and consumer search, where uniform convergence rates of density functions are important for nonparametric and semiparametric estimation. Existin…

"Do countries with higher GDP per capita also have longer life expectancy?" I built a tool that lets you explore questions like that across 48 countries by picking any two of five metrics as scatter-plot axes. Two implementation hinges: (1) metrics that span orders of magnitude (population: Singapore 5.6M to India 1,417M, a 250× range) must be plotted and correlated on a log scale or every point …
I am trying to estimate the Hurst exponent over different time-lag ranges. However, I get negative values in some lag ranges, which is odd because the Hurst exponent should lie between 0 and 1. This is the Python code to calculate the Hurst exponent: *calculate Hurst* lag1 = 2 lags = range(lag1, 20) tau = [sqrt(std(subtract(ts[lag:], ts[:-lag]))) for lag in lags] plot(log(lags),…
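The snippet above is cut off, so the following is only a sketch of how such a lag-based estimate is commonly completed. The log-log line fit, the factor of 2 applied to the slope, and the function name `hurst_from_lags` are assumptions, not the poster's code. The `tau` line mirrors the one quoted in the question.

```python
import numpy as np

def hurst_from_lags(ts, min_lag=2, max_lag=20):
    """Estimate a Hurst exponent from the scaling of lagged differences."""
    ts = np.asarray(ts, dtype=float)
    lags = np.arange(min_lag, max_lag)
    # Same statistic as in the question: sqrt of the std of lagged differences.
    tau = [np.sqrt(np.std(ts[lag:] - ts[:-lag])) for lag in lags]
    # Assumed completion: slope of log(tau) against log(lag).
    slope, _intercept = np.polyfit(np.log(lags), np.log(tau), 1)
    # std scales like lag**H, so sqrt(std) scales like lag**(H / 2).
    return 2.0 * slope

# Sanity check on a synthetic random walk, where H should be close to 0.5.
rng = np.random.default_rng(42)
random_walk = np.cumsum(rng.normal(size=20000))
h = hurst_from_lags(random_walk)
print(f"estimated H = {h:.3f}")
```

The estimate is a fitted least-squares slope, and nothing in the fit constrains it to the interval from 0 to 1. On short or noisy lag windows, or on series whose differences barely grow with the lag, the slope can come out slightly negative.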

Nature, Published online: 12 June 2026; doi:10.1038/d41586-026-01888-9 A new benchmark pitting AI against previously unseen maths problems shows systems still fall short of top human expertise.
Let's say I have a dataset library(tidyverse) set.seed(123) df = tibble( x = rnorm(1000), y = -0.1 * x + rnorm(1000, sd = 0.1), truncate = (y > (0.25*x -0.4)) & (y < 0.25*x+0.4)) ...

Just a question out of curiosity. I am reviewing a manuscript which used (forward) stepwise regression for selection of models for the prediction of metastases in a certain type of cancer. I am fully ...
