No user data is saved or collected by this site.
These calculators are not diagnostic. No medical action should be taken based on anything you read or see on this site. If you are concerned that you may have herpes, you should consult a doctor.
The creator of this tool does not condone the use of the tool or site for any purpose other than calculating the user’s own chance of having herpes. The user should not input other people’s information without their explicit consent.
Although every effort has been made to make these tools as accurate as possible, they use a relatively small number of inputs to estimate the chance that you have herpes. Your life is much more complex than a few inputs, and this limits the accuracy of the tools. These are the best estimates available from the data, but they are only rough estimates.
These algorithms were built using data from the CDC National Health and Nutrition Examination Survey (NHANES) from 2009 to 2016. All of the data has been released publicly and contains no personally identifiable information.
The simple home page calculator bins the NHANES data by the selected demographics. The data is filtered to match the user's selections, and the percentage of herpes-positive individuals in the filtered group is displayed.
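As an illustration of the filter-and-count idea described above, here is a minimal sketch. The records, field names, and the `percent_positive` function are hypothetical and are not taken from the site's code; the sketch also ignores the NHANES sample weights for simplicity.

```python
from typing import Optional

# Hypothetical records: each dict stands for one survey respondent.
RECORDS = [
    {"sex": "female", "age_group": "20-29", "hsv_positive": True},
    {"sex": "female", "age_group": "20-29", "hsv_positive": False},
    {"sex": "female", "age_group": "30-39", "hsv_positive": True},
    {"sex": "male", "age_group": "20-29", "hsv_positive": False},
    {"sex": "male", "age_group": "20-29", "hsv_positive": False},
    {"sex": "male", "age_group": "30-39", "hsv_positive": True},
]


def percent_positive(records, **selections) -> Optional[float]:
    """Keep records matching every selection, then return the percent positive.

    Returns None when no record matches, so the caller can show
    "not enough data" instead of a misleading 0%.
    """
    matches = [
        r for r in records
        if all(r.get(field) == value for field, value in selections.items())
    ]
    if not matches:
        return None
    positives = sum(1 for r in matches if r["hsv_positive"])
    return 100.0 * positives / len(matches)


if __name__ == "__main__":
    print(percent_positive(RECORDS, sex="female", age_group="20-29"))
```

Each additional selection narrows the bin, so the estimate becomes more specific but rests on fewer respondents.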
The advanced calculators use the CDC NHANES laboratory, demographics, and questionnaire datasets, merged over several survey years. Initially, Stata 16.0 was used to run t-tests on all variables in the NHANES questionnaire and demographic datasets, separately for HSV1 and HSV2, to identify variables that exhibited low attrition and were significantly correlated with herpes prevalence. Traditional thresholds of significance were used to determine which variables to include in the regression analysis for each virus type. Logistic regression analysis was then used to further refine the list of variables significantly correlated with herpes.
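The screening step can be pictured with a small sketch. The original analysis was done in Stata, so the Python below is only an illustration: the variable names and values are made up, Welch's unequal-variance t-test is an assumption (the text does not say which variant was used), and the cutoff of 1.96 is the large-sample value for a two-sided 5% level rather than a threshold taken from the site.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    va = variance(group_a) / na  # squared standard error of group A's mean
    vb = variance(group_b) / nb  # squared standard error of group B's mean
    t = (mean(group_a) - mean(group_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


def screen(positive, negative, threshold=1.96):
    """Keep variables whose |t| between the two groups exceeds the threshold."""
    kept = []
    for name in positive:
        t, _ = welch_t(positive[name], negative[name])
        if abs(t) > threshold:
            kept.append(name)
    return kept


# Hypothetical values of two variables, split by test result.
POSITIVE = {"age": [45, 52, 38, 60, 49, 55], "household_size": [2, 3, 4, 2, 3, 4]}
NEGATIVE = {"age": [25, 31, 28, 35, 22, 30], "household_size": [3, 2, 4, 3, 2, 4]}

if __name__ == "__main__":
    print(screen(POSITIVE, NEGATIVE))
```

Variables that pass a screen like this would then go forward to the logistic regression stage for further refinement.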
The final model specification was then used to train a logistic regression model using scikit-learn (sklearn) in Python. One-hot encoding was used to encode categorical data. The final data frames had shapes of 4171x18 for HSV1 and 6077x20 for HSV2 (rows x columns). The models were fine-tuned using an iterative train-test split approach with 100 random states, comparing the accuracy of many different logistic models.
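The repeated train-test split can be sketched roughly as follows. This is not the site's training code: the data is synthetic, and the `mean_accuracy` helper, the 25% test size, and the two candidate values of the regularization parameter `C` are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def mean_accuracy(X, y, n_states=100, test_size=0.25, **model_kwargs):
    """Fit one model per random split and return (mean, std) of test accuracy."""
    scores = []
    for state in range(n_states):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=state
        )
        model = LogisticRegression(max_iter=1000, **model_kwargs)
        model.fit(X_tr, y_tr)
        scores.append(accuracy_score(y_te, model.predict(X_te)))
    return float(np.mean(scores)), float(np.std(scores))


# Synthetic stand-in data: one numeric feature plus one 3-level categorical.
rng = np.random.default_rng(0)
n = 400
category = rng.integers(0, 3, size=n)
one_hot = np.eye(3)[category]          # one-hot encode the categorical column
age_z = rng.normal(0.0, 1.0, size=n)   # standardized numeric feature
X = np.column_stack([age_z, one_hot])
logits = 1.0 * age_z + np.array([-0.5, 0.0, 0.8])[category]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

# Compare candidate models by their accuracy averaged over 100 splits.
candidates = {"C=0.1": {"C": 0.1}, "C=1.0": {"C": 1.0}}
results = {name: mean_accuracy(X, y, **kw) for name, kw in candidates.items()}

if __name__ == "__main__":
    for name, (avg, spread) in results.items():
        print(f"{name}: mean accuracy {avg:.3f} (std {spread:.3f})")
```

Averaging over many random states makes the comparison between candidate models less sensitive to any single lucky or unlucky split.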