📊 Data & Analysisadvancedregressionstatisticsmachine-learningdata-science

Run a Regression Analysis

Set up and interpret a regression analysis to understand what factors drive a business outcome.

The Prompt

prompt.txt
Help me run a regression analysis on the following data. Provide:
1. Model selection: linear / logistic / multiple / polynomial — with reasoning
2. Variable selection: which to include, interactions to consider
3. Python/R code to run the analysis
4. How to interpret the coefficients
5. Diagnostic checks: multicollinearity, residual plots, outlier influence
6. How to present results to non-technical stakeholders
7. Limitations and caveats

Research question: [WHAT ARE YOU TRYING TO PREDICT OR EXPLAIN?]
Outcome variable: [WHAT YOU'RE PREDICTING]
Predictor variables: [LIST YOUR INDEPENDENT VARIABLES]
Data size: [NUMBER OF OBSERVATIONS]
Language: [PYTHON (sklearn/statsmodels) / R]

Example Output

Multiple linear regression to predict 30-day LTV from signup characteristics. Code using statsmodels OLS with company_size, plan_at_signup, and days_to_first_api_call as predictors. Results: days_to_first_api_call has the strongest negative coefficient (-$45 LTV per day delayed). R² = 0.61. VIF scores all <5 (no multicollinearity).

FAQ

Which AI model is best for Run a Regression Analysis?

Claude Sonnet 4 — strong at statistics with clear plain-English interpretation.

How do I use the Run a Regression Analysis prompt?

Copy the prompt, replace the [BRACKETED] placeholders with your specific information, and paste into your preferred AI assistant (ChatGPT, Claude, Gemini, etc.). Multiple linear regression to predict 30-day LTV from signup characteristics. Code using statsmodels OLS with company_size, plan_at_signup, and days_to_first_api_call as predictors. Results: days_to_first_api_call has the strongest negative coefficient (-$45 LTV per day delayed). R² = 0.61. VIF scores all <5 (no multicollinearity).