
What are the calculation methods of Logistic regression sample size?

2025-03-31 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

This article explains how to calculate the sample size for Logistic regression, with analysis and a worked example, in the hope of giving readers who face this problem a simpler way to solve it.

Logistic regression is a widely used statistical model. In practice, many researchers ignore its sample size requirement, or dismiss the question with a remark such as "the number of included subjects is sufficient." As a result, the study design stage does not account for the two types of error (type I and type II) when exploring the relationship between the main influencing factors and the outcome. Three Logistic regression sample size calculation methods are introduced below with examples, to help investigators design and carry out their studies on a sound footing.

Logistic regression models are widely used in many disciplines, such as medicine, the social sciences, and machine learning. They apply when the dependent variable is categorical, and especially when it is a binary (0/1) variable. The model's parameters are estimated by maximum likelihood estimation (MLE), which needs a sufficiently large sample for the estimates to be accurate, and determining that sample size is a problem that often puzzles researchers. The following summarizes several commonly used methods for determining sample size in binary Logistic regression analysis.

Empirical method

The most widely used method at present is the EPV (events per variable) method, i.e., the number of events per independent variable, where an "event" is the category of the dependent variable with the smaller number of cases. For example, suppose we want to investigate the relationship between the incidence of gastric cancer and three lifestyle factors (X1 represents bad eating habits, X2 represents eating salty food, and X3 represents mental status). If the proportion of gastric cancer patients is 20% and the planned EPV is 10, then with three covariates the number of gastric cancer patients required is 10×3=30, and the total sample size required (gastric cancer patients plus healthy controls) is 30÷20%=150 cases.

When EPV is too small, separation occurs easily. Separation arises when the dependent variable takes only one value whenever an independent variable is above (or below) some constant. For example, when X is a continuous variable, if Y is always 1 whenever X≤0, complete separation occurs: the parameter estimation cannot converge, and no estimate of the regression coefficient can be obtained. Another case is when X
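The EPV arithmetic and the complete-separation condition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `epv_sample_size` and `is_completely_separated` are hypothetical helpers invented here, the default EPV of 10 simply mirrors the gastric cancer example, and the separation check handles just one continuous predictor.

```python
import math
from fractions import Fraction


def epv_sample_size(n_covariates, event_proportion, epv=10):
    """Return (events_needed, total_sample_size) under the EPV rule of thumb.

    Hypothetical helper: `event_proportion` is the expected share of the
    less frequent outcome class, so it must lie in (0, 0.5].
    """
    if n_covariates < 1:
        raise ValueError("n_covariates must be at least 1")
    if not 0 < event_proportion <= 0.5:
        raise ValueError("event_proportion must be in (0, 0.5]")
    events_needed = epv * n_covariates
    # Use an exact fraction so that, e.g., 30 / 0.2 is exactly 150
    # and is not nudged up by floating-point rounding before ceil().
    proportion = Fraction(str(event_proportion))
    total = math.ceil(events_needed / proportion)
    return events_needed, total


def is_completely_separated(x, y):
    """Check whether one continuous predictor perfectly splits a 0/1 outcome.

    Complete separation here means every x with y == 1 lies strictly on
    one side of every x with y == 0.
    """
    x_pos = [xi for xi, yi in zip(x, y) if yi == 1]
    x_neg = [xi for xi, yi in zip(x, y) if yi == 0]
    if not x_pos or not x_neg:
        return False  # only one outcome class observed; nothing to separate
    return max(x_pos) < min(x_neg) or max(x_neg) < min(x_pos)


# Gastric cancer example: 3 covariates, 20% events, EPV = 10.
events, total = epv_sample_size(n_covariates=3, event_proportion=0.2)
print(events, total)

# Y is always 1 when X <= 0 and always 0 when X > 0: complete separation.
print(is_completely_separated([-2, -1, 0, 1, 2], [1, 1, 1, 0, 0]))
```

Running such a check on pilot data before fitting can flag a design where the maximum likelihood fit is unlikely to converge; it is not a substitute for inspecting the fitted model.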
