by EK.
Last Updated April 16, 2018 07:19 AM

I am fitting a logistic regression to predict a binary output variable (say, good and bad). One of the independent variables, City, is categorical and has about 30 distinct values. I want to compute the weight of evidence (WOE) for each of these 30 cities. I am using the formula

$$ WOE = \frac{\log \text{No. of good records}}{\log \text{No. of bad records}}. $$

However, this gives values of $\pm \infty$ if the number of good records is zero or the number of bad records is zero or one.
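To make the problem concrete, here is a minimal sketch that evaluates the formula exactly as written above for a few made-up counts. The helper name `woe_as_written` and the counts are assumptions for illustration only. NumPy is used because `math.log(0)` raises an error instead of returning $-\infty$. Note that under floating-point rules a positive good count with a zero bad count evaluates to zero for this ratio-of-logs form, since a finite number divided by $-\infty$ is zero, so that case is left out of the sketch.

```python
import numpy as np

def woe_as_written(n_good, n_bad):
    """Ratio of logs, as in the formula above (hypothetical helper)."""
    # Silence NumPy's warnings for log(0) and division by zero so the
    # infinities are returned quietly instead of printing RuntimeWarnings.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.log(float(n_good)) / np.log(float(n_bad))

# Made-up counts for three cities.
print(woe_as_written(0, 5))    # no good records: log(0) = -inf, result is -inf
print(woe_as_written(5, 1))    # one bad record: log(1) = 0, result is +inf
print(woe_as_written(40, 10))  # both counts above one: finite value
```

Any city whose counts hit one of the first two cases would enter the regression with an infinite predictor value.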

How should such cases be handled? Or is it OK to keep the $\pm \infty$ values and use them in the logistic regression?
