A novel algorithm uses a person’s internet search queries to potentially predict an impending stroke event, allowing for rapid or immediate preventive action, according to a recent study.
“We have developed an internet-based detection model that retrospectively identified subjects who later developed stroke according to self-reports,” the researchers said. “The usefulness of this system in real-life patients awaits validation in a clinical setting where this model must be tested against accurate diagnoses, information on stroke risk factors, comorbidities, drugs, and outcome.”
Using query data from the Bing search engine, the researchers examined the search patterns of 285 people whose queries indicated a previous stroke event (“I had a stroke” or “I was diagnosed with a stroke”); these users formed the patient cohort. In parallel, a control cohort of 1,195 older individuals (aged ≥60 years) was included, which could be stratified into several subpopulations according to age group and clinical condition.
The participants’ search queries were then evaluated according to several attributes that could represent cognitive ability: number of words per query, the likelihood of the string in the entire population, use of automatic spelling correction, number of new words used, farthest link clicked, and keyword use, taking into account personal references and the mention of relevant symptoms or drugs.
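As a rough illustration of how a few of these attributes might be computed from one user's query stream, consider the minimal sketch below. It is not the study's implementation: the `Query` record, the `spell_corrected` flag, and the `summarize_queries` helper are hypothetical names introduced here, and only three of the listed attributes (words per query, new words used, and use of spelling correction) are covered.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Query:
    text: str
    spell_corrected: bool  # hypothetical flag: did the engine auto-correct this query?


def summarize_queries(queries):
    """Summarize one user's chronological query list into a few attributes."""
    seen = set()           # words this user has already typed in earlier queries
    word_counts = []       # number of words in each query
    new_word_counts = []   # number of previously unseen words in each query
    for q in queries:
        words = q.text.lower().split()
        word_counts.append(len(words))
        unique = set(words)
        new_word_counts.append(len(unique - seen))
        seen |= unique
    return {
        "mean_words_per_query": mean(word_counts),
        "mean_new_words_per_query": mean(new_word_counts),
        "spell_correction_rate": mean(
            1.0 if q.spell_corrected else 0.0 for q in queries
        ),
    }


# Illustrative input only; these are not queries from the study.
example = [Query("weak left arm", False), Query("left arm numb", True)]
summary = summarize_queries(example)
```

Attributes such as query likelihood in the entire population or the farthest link clicked would need population-level logs and click data, so they are left out of this sketch.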
Using these attributes for the patient and control cohorts, receiver operating characteristic (ROC) curve analysis showed that the search query model achieved excellent differentiation between the two groups. [J Med Internet Res 2021;23:e27084]
Moreover, stratifying the control cohort into subgroups according to different criteria did not diminish the performance of the search query model. Areas under the ROC curve (AUCs) showed that the model could continue to discriminate stroke patients from controls who were 60–64, 65–75, and ≥75 years of age. The same was true when control participants had been enrolled for other diseases, such as migraine, depression, and Alzheimer’s disease.
Controls with cardiovascular conditions, such as heart attack, hypertension, and migraine, seemed more difficult to distinguish from stroke patients when using the model, but ROC analysis nevertheless found acceptable AUCs.
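For readers unfamiliar with the metric, the AUC can be read as the probability that a randomly chosen stroke patient receives a higher model score than a randomly chosen control, with ties counted as half. The sketch below computes it directly from that definition; the `auc` function name and the scores are made up for illustration and are not taken from the study.

```python
def auc(patient_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (patient, control) pairs in which the patient
    scores higher; tied pairs contribute one half.
    """
    wins = 0.0
    for p in patient_scores:
        for c in control_scores:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_scores) * len(control_scores))


# Made-up scores: three of the four patient-control pairs are ranked correctly.
partial = auc([0.9, 0.4], [0.5, 0.1])
# Made-up scores: every patient outscores every control.
perfect = auc([0.9, 0.8], [0.1, 0.2])
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which is the scale on which the subgroup results should be read.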
The researchers found that the best attributes for separating the stroke cohort from the control cohorts were, in descending order, the average query likelihood, the standard deviation of query likelihood, the average number of queries per session, the average number of spelling mistakes, and the average number of repeated queries.
“Our findings suggest that among internet users, stroke events are preceded by alterations in communication patterns that have been previously shown to reflect aspects of cognitive function,” the researchers said, suggesting that an unusual break in the stream of queries right before the user’s first mention of a stroke might mark the most likely time of the clinical cerebrovascular event.
“Assuming that our presumed time of stroke correctly approximates the time of the real event, the internet-derived signal of an impending stroke tends to gain strength as the time of the stroke approaches,” they said. “If monitored continuously, not only the signal per se but also its persistence and intensification with time might comprise a useful aid in attempted identification of an impending stroke.”
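One way to picture the idea of a signal that both persists and intensifies is the sketch below. It assumes a hypothetical series of per-window risk scores for a single user, listed in chronological order, and flags the series when the most recent windows all stay above a threshold and the overall least-squares trend is upward. The function name, the threshold, and the window count are illustrative assumptions, not parameters reported by the researchers.

```python
def is_intensifying(scores, threshold, min_run=3):
    """Return True if the signal is persistent and rising.

    scores: chronological risk scores, one per monitoring window (hypothetical).
    threshold: score at or above which a window counts as elevated (assumed).
    min_run: how many most recent windows must all be elevated (assumed).
    """
    if len(scores) < max(min_run, 2):
        return False
    # Persistence: the last `min_run` windows are all at or above the threshold.
    persistent = all(s >= threshold for s in scores[-min_run:])
    # Intensification: ordinary least-squares slope of score against window index.
    n = len(scores)
    x_mean = (n - 1) / 2
    y_mean = sum(scores) / n
    num = sum((i - x_mean) * (s - y_mean) for i, s in enumerate(scores))
    den = sum((i - x_mean) ** 2 for i in range(n))
    slope = num / den
    return persistent and slope > 0


# Made-up series: the first rises and stays elevated, the second fades.
rising = is_intensifying([0.2, 0.4, 0.6, 0.7, 0.9], threshold=0.5)
fading = is_intensifying([0.9, 0.7, 0.6, 0.4, 0.2], threshold=0.5)
```

As the researchers note, any such monitoring rule would still need validation against accurate clinical diagnoses before it could be relied on.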