2024
| |
A Digital Cohort Approach for Social Media Monitoring: A Cohort Study of People Who Vape E-Cigarettes.
John W Ayers, Adam Poliak, Nikolas T Beros, Michael Paul, Mark Dredze, Michael Hogarth, Davey M Smith.
American Journal of Preventive Medicine (AJPM) 2024.
BibTex
@article{Ayers:2024aa,
abstract = {Introduction. The evidence hierarchy in public health emphasizes longitudinal studies, whereas social media monitoring relies on aggregate analyses. Authors propose integrating longitudinal analyses into social media monitoring by creating a digital cohort of individual account holders, as demonstrated by a case study analysis of people who vape. Methods. All English language X posts mentioning vape or vaping were collected from January 1, 2017 through December 31, 2020. The digital cohort was composed of people who self-reported vaping and posted at least 10 times about vaping during the study period to determine the (1) prevalence, (2) success rate, and (3) timing of cessation behaviors. Results. There were 25,112 instances where an account shared at least 10 posts about vaping, with 619 (95% CI= 616,622) mean person-days and 43,810,531 cumulative person-days of observation. Among a random sample of accounts, 39% (95% CI= 35,43) belonged to persons who vaped. Among this digital cohort, 27% (95% CI= 21,33) reported making a quit attempt. For all first quit attempts, 26% (95% CI= 19,33) were successful on the basis of their subsequent vaping posts. Among those with a failed first cessation attempt, 13% (95% CI= 6, 19) subsequently made an additional quit attempt, of whom 36% (95% CI= 11, 61) were successful. On average, a quit attempt occurred 531 days (95% CI= 474,588) after their first vaping-related post. If their quit attempt failed, any second quit attempt occurred 361 days (95% CI= 250, 474) after their first quit attempt. Conclusions. By aligning with standard epidemiologic surveillance practices, this approach can greatly enhance the usefulness of social media monitoring in informing public health decision making, such as yielding insights into the timing of cessation behaviors among people who vape.},
author = {John W. Ayers and Adam Poliak and Nikolas T. Beros and Michael Paul and Mark Dredze and Michael Hogarth and Davey M. Smith},
url = {https://doi.org/10.1016/j.amepre.2024.01.016},
journal = {American Journal of Preventive Medicine (AJPM)},
month = {July},
number = {1},
pages = {147-154},
title = {A Digital Cohort Approach for Social Media Monitoring: A Cohort Study of People Who Vape E-Cigarettes},
volume = {67},
year = {2024}
}
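The cohort construction this paper's abstract describes — keeping accounts that post about vaping at least 10 times and following each over its person-days of observation — can be sketched as follows. This is a hypothetical illustration (function and variable names are invented, not the authors' code):

```python
from datetime import date

def digital_cohort(posts, min_posts=10):
    """Group (account, post_date) records into a cohort of accounts with at
    least `min_posts` topical posts, and report each cohort member's
    person-days of observation (days between first and last post)."""
    by_account = {}
    for account, day in posts:
        by_account.setdefault(account, []).append(day)
    return {
        account: (max(days) - min(days)).days
        for account, days in by_account.items()
        if len(days) >= min_posts
    }

# Two accounts; only "a" reaches the 10-post inclusion threshold.
posts = [("a", date(2017, 1, d)) for d in range(1, 11)] + [("b", date(2017, 1, 1))]
print(digital_cohort(posts))  # {'a': 9}
```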
|
2023
| |
E-commerce licensing loopholes: a case study of online shopping for tobacco products following a statewide sales restriction on flavoured tobacco in California.
Eric C Leas, Tomas Mejorado, Raquel Harati, Shannon Ellis, Nora Satybaldiyeva, Nicolas Morales, Adam Poliak.
Tobacco Control 2023.
Abstract
Introduction: Retailer licensing programmes can be an effective method of enforcing tobacco control laws, but most programmes do not require e-commerce retailers to obtain licenses. California’s implementation of a statewide flavour restriction (Senate Bill 793 (SB-793)) in December 2022 enforced through its tobacco retailer licensing programme presented an opportunity to assess whether the exclusion of e-commerce in the definition of ‘tobacco retailer’ might have resulted in a shift in consumer behaviour towards e-commerce. Methods: To examine the association between SB-793 implementation and online shopping for tobacco, we collected weekly Google search rates related to online shopping for cigarettes and vaping products in California from January 2018 to May 2023. We compared observed rates of shopping queries after SB-793 implementation to counterfactual expected rates and prediction intervals (PI) calculated from autoregressive integrated moving average models fit to historical trends. Content analysis was performed on the search results to identify websites marketing flavoured vaping products and menthol cigarettes. Results: The week SB-793 was implemented, shopping queries were 194.4% (95% PI 100.8% to 451.5%) and 161.7% (95% PI 81.7% to 367.5%) higher than expected for cigarettes and vapes, respectively. Cigarette shopping queries remained elevated significantly for 11 weeks and vape shopping queries for 6 weeks. All search results contained links to websites that offered flavoured vaping products or menthol cigarettes to Californian consumers. Discussion: These findings raise concerns about potential loopholes in policy enforcement created by the absence of explicit regulations on e-commerce sales in retailer licensing programmes. Strengthening regulations to include e-commerce and monitoring e-commerce compliance are recommended to enhance the impact of laws enforced through retailer licensing programmes.
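The core comparison above — observed post-policy search rates against a counterfactual expectation — can be sketched in simplified form. The function below is a hypothetical illustration, not the paper's code (which fit ARIMA models with prediction intervals); here the expected rate is just the mean of the historical series:

```python
def percent_elevation(historical, observed):
    """How much `observed` exceeds the expectation from `historical`, as a
    percentage. Simplified stand-in for an ARIMA-based counterfactual:
    the expected rate is the mean of the pre-policy series."""
    expected = sum(historical) / len(historical)
    return 100.0 * (observed - expected) / expected

# Hypothetical weekly shopping-query rates before the policy, then the week after:
baseline = [10.0, 12.0, 11.0, 9.0, 10.0, 11.0]
print(round(percent_elevation(baseline, 30.5), 1))  # 190.5
```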
BibTex
@article{leas2023commerce,
title={E-commerce licensing loopholes: a case study of online shopping for tobacco products following a statewide sales restriction on flavoured tobacco in California},
author={Leas, Eric C and Mejorado, Tomas and Harati, Raquel and Ellis, Shannon and Satybaldiyeva, Nora and Morales, Nicolas and Poliak, Adam},
journal={Tobacco Control},
year={2023},
publisher={BMJ Publishing Group Ltd}
}
News Coverage:
[US News]
[Yahoo]
|
Evaluating Artificial Intelligence Responses to Public Health Questions.
John W Ayers, Zechariah Zhu, Adam Poliak, Eric C Leas, Mark Dredze, Michael Hogarth, Davey M Smith.
JAMA Network Open 2023.
BibTex
@article{ayers2023evaluating,
title={Evaluating Artificial Intelligence Responses to Public Health Questions},
author={Ayers, John W and Zhu, Zechariah and Poliak, Adam and Leas, Eric C and Dredze, Mark and Hogarth, Michael and Smith, Davey M},
journal={JAMA Network Open},
volume={6},
number={6},
pages={e2317517--e2317517},
year={2023},
publisher={American Medical Association}
}
News Coverage:
[CNN]
[NY Post]
|
Evaluating Paraphrastic Robustness in Textual Entailment Models.
Dhruv Verma, Yash Kumar Lal, Shreyashee Sinha, Benjamin Van Durme and Adam Poliak.
ACL 2023.
Abstract
We present PaRTE, a collection of 1,126 pairs of Recognizing Textual Entailment (RTE) examples to evaluate whether models are robust to paraphrasing. We posit that if RTE models understand language, their predictions should be consistent across inputs that share the same meaning. We use the evaluation set to determine if RTE models’ predictions change when examples are paraphrased. In our experiments, contemporary models change their predictions on 8-16% of paraphrased examples, indicating that there is still room for improvement.
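The robustness check the abstract describes — comparing a model's label on each example with its label on the paraphrased version — reduces to measuring how often predictions flip. A minimal sketch (illustrative; these names are not from the paper's released code):

```python
def prediction_flip_rate(original_preds, paraphrased_preds):
    """Fraction of examples whose predicted label changes under paraphrasing."""
    assert len(original_preds) == len(paraphrased_preds)
    flips = sum(o != p for o, p in zip(original_preds, paraphrased_preds))
    return flips / len(original_preds)

# Hypothetical RTE labels before and after paraphrasing the premise:
orig = ["entailment", "not_entailment", "entailment", "entailment"]
para = ["entailment", "entailment", "entailment", "not_entailment"]
print(prediction_flip_rate(orig, para))  # 2 of 4 examples flip -> 0.5
```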
Data
Code
BibTex
@inproceedings{verma-etal-2023-evaluating,
title = "Evaluating Paraphrastic Robustness in Textual Entailment Models",
author = "Verma, Dhruv and
Lal, Yash Kumar and
Sinha, Shreyashee and
Van Durme, Benjamin and
Poliak, Adam",
booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)",
month = jul,
year = "2023",
address = "Toronto, Canada",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.acl-short.76",
doi = "10.18653/v1/2023.acl-short.76",
pages = "880--892",
abstract = "We present PaRTE, a collection of 1,126 pairs of Recognizing Textual Entailment (RTE) examples to evaluate whether models are robust to paraphrasing. We posit that if RTE models understand language, their predictions should be consistent across inputs that share the same meaning. We use the evaluation set to determine if RTE models{'} predictions change when examples are paraphrased. In our experiments, contemporary models change their predictions on 8-16{\%} of paraphrased examples, indicating that there is still room for improvement.",
}
|
Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum.
John W Ayers, Adam Poliak, Mark Dredze, Eric C Leas, Zechariah Zhu, Jessica B Kelley, Dennis J Faix, Aaron M Goodman, Christopher A Longhurst, Michael Hogarth, Davey M Smith.
JAMA Internal Medicine 2023.
Data
BibTex
@article{10.1001/jamainternmed.2023.1838,
author = {Ayers, John W. and Poliak, Adam and Dredze, Mark and Leas, Eric C. and Zhu, Zechariah and Kelley, Jessica B. and Faix, Dennis J. and Goodman, Aaron M. and Longhurst, Christopher A. and Hogarth, Michael and Smith, Davey M.},
title = "{Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum}",
journal = {JAMA Internal Medicine},
year = {2023},
month = {04},
issn = {2168-6106},
doi = {10.1001/jamainternmed.2023.1838},
url = {https://doi.org/10.1001/jamainternmed.2023.1838},
eprint = {https://jamanetwork.com/journals/jamainternalmedicine/articlepdf/2804309/jamainternal\_ayers\_2023\_oi\_230030\_1681999216.70842.pdf}
}
News Coverage:
[The Hill]
[The Wall Street Journal]
[US News]
[Jerusalem Post]
|
Automated Discovery of Perceived Health-related Concerns about E-cigarettes from Reddit.
Alexandra DeLucia, Adam Poliak, Zechariah Zhu, Stephanie R Pitts, Mario Navarro, Sharareh Shojaie, John W Ayers, Mark Dredze.
Annual Meeting of the Society for Research on Nicotine and Tobacco 2023.
BibTex
@inproceedings{DeLucia:2023aa,
author = {Alexandra DeLucia and Adam Poliak and Zechariah Zhu and Stephanie R. Pitts and Mario Navarro and Sharareh Shojaie and John W. Ayers and Mark Dredze},
booktitle = {Annual Meeting of the Society for Research on Nicotine and Tobacco},
keywords = {abstract},
title = {Automated Discovery of Perceived Health-related Concerns about E-cigarettes from Reddit},
year = {2023}
}
|
2022
| |
Internet Searches for Abortion Medications Following the Leaked SCOTUS Draft Ruling.
Adam Poliak, Nora Satybaldiyeva, Steffanie A. Strathdee, Eric C. Leas, Ramesh Rao, Davey Smith, John W. Ayers.
JAMA Internal Medicine 2022.
Abstract
On May 2, 2022, a draft Supreme Court of the United States (SCOTUS) majority opinion was leaked, foreshadowing the decision to overturn the 1973 Roe V Wade decision and allow states to further restrict or ban abortions. Concerns about lost access to legal abortions may lead to the public educating themselves about how to obtain abortion services. We evaluated whether internet searches for abortion medications increased following the leak.
BibTex
@article{10.1001/jamainternmed.2022.2998,
author = {Poliak, Adam and Satybaldiyeva, Nora and Strathdee, Steffanie A. and Leas, Eric C. and Rao, Ramesh and Smith, Davey and Ayers, John W.},
title = "{Internet Searches for Abortion Medications Following the Leaked Supreme Court of the United States Draft Ruling}",
journal = {JAMA Internal Medicine},
year = {2022},
month = {06},
abstract = "{On May 2, 2022, a draft Supreme Court of the United States (SCOTUS) majority opinion was leaked, foreshadowing the decision to overturn the 1973 Roe V Wade decision and allow states to further restrict or ban abortions. Concerns about lost access to legal abortions may lead to the public educating themselves about how to obtain abortion services. We evaluated whether internet searches for abortion medications increased following the leak.}",
issn = {2168-6106},
doi = {10.1001/jamainternmed.2022.2998},
url = {https://doi.org/10.1001/jamainternmed.2022.2998},
eprint = {https://jamanetwork.com/journals/jamainternalmedicine/articlepdf/2793813/jamainternal\_poliak\_2022\_ld\_220023\_1656357378.52325.pdf},
}
News Coverage:
[Philadelphia Inquirer]
[The Guardian]
[Daily Mail]
[Politico]
[Insider]
[Yahoo]
[ABC News]
[Today]
[CNN]
|
On Gender Biases in Offensive Language Classification Models.
Sanjana Marcé, Adam Poliak.
Workshop on Gender Bias in Natural Language Processing @ NAACL 2022.
Abstract
We explore whether neural Natural Language Processing models trained to identify offensive language in tweets contain gender biases. We add historically gendered and gender ambiguous American names to an existing offensive language evaluation set to determine whether models’ predictions are sensitive or robust to gendered names. While we see some evidence that these models might be prone to biased stereotypes that men use more offensive language than women, our results indicate that these models’ binary predictions might not greatly change based upon gendered names.
BibTex
@inproceedings{marce-2022-genderbias,
title = "On Gender Biases in Offensive Language Classification Models",
author = "Marcé, Sanjana and Poliak, Adam",
booktitle = "Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing",
month = jul,
year = "2022",
address = "Seattle, Washington",
publisher = "Association for Computational Linguistics",
}
|
A Machine Learning Approach For Discovering Tobacco Brands, Products, and Manufacturers in the United States.
Adam Poliak, Paiheng Xu, Eric Leas, Mario Navarro, Stephanie Pitts, Andie Malterud, John W Ayers, Mark Dredze.
Annual Meeting of the Society for Research on Nicotine and Tobacco 2022.
BibTex
@inproceedings{Poliak:2022wa,
author = {Adam Poliak and Paiheng Xu and Eric Leas and Mario Navarro and Stephanie Pitts and Andie Malterud and John W Ayers and Mark Dredze},
booktitle = {Annual Meeting of the Society for Research on Nicotine and Tobacco},
date-added = {2022-02-11 10:19:54 -0500},
date-modified = {2022-02-11 10:21:15 -0500},
keywords = {abstract},
title = {A Machine Learning Approach For Discovering Tobacco Brands, Products, and Manufacturers in the United States},
year = {2022}
}
|
2021
| |
Characterizing Test Anxiety on Social Media.
Esha Julka, Olivia Kowalishin, Jalisha B. Jenifer, Adam Poliak.
WiNLP 2021.
|
Discovering Changes in Birthing Narratives During COVID-19.
Daphna Spira, Noreen Mayat, Caitlin Dreisbach, Adam Poliak.
WiNLP 2021.
|
Figurative Language in Recognizing Textual Entailment.
Tuhin Chakrabarty, Debanjan Ghosh, Adam Poliak, Smaranda Muresan.
Findings of ACL 2021.
Abstract
We introduce a collection of recognizing textual entailment (RTE) datasets focused on figurative language. We leverage five existing datasets annotated for a variety of figurative language – simile, metaphor, and irony – and frame them into over 12,500 RTE examples. We evaluate how well state-of-the-art models trained on popular RTE datasets capture different aspects of figurative language. Our results and analyses indicate that these models might not sufficiently capture figurative language, struggling to perform pragmatic inference and reasoning about world knowledge. Ultimately, our datasets provide a challenging testbed for evaluating RTE models.
Code
BibTex
@inproceedings{chakrabarty-etal-2021-figurative,
title = "Figurative Language in Recognizing Textual Entailment",
author = "Chakrabarty, Tuhin and
Ghosh, Debanjan and
Poliak, Adam and
Muresan, Smaranda",
booktitle = "Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021",
month = aug,
year = "2021",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.findings-acl.297",
doi = "10.18653/v1/2021.findings-acl.297",
pages = "3354--3361",
}
|
Fine-tuning Transformers for Identifying Self-Reporting Potential Cases and Symptoms of COVID-19 in Tweets.
Max Fleming, Priyanka Dondeti, Caitlin Dreisbach, Adam Poliak.
Social Media Mining for Health Applications (Shared Task) 2021.
Abstract
Code
BibTex
|
An Immersive Computational Text Analysis Course for Non-Computer Science Students at Barnard College.
Adam Poliak, Jalisha Jenifer.
Fifth Workshop on Teaching NLP @ NAACL 2021.
Abstract
We provide an overview of a new Computational Text Analysis course that will be taught at Barnard College over a six week period in May and June 2021. The course is targeted to non-Computer Science students at a Liberal Arts college who wish to incorporate fundamental Natural Language Processing tools in their research and studies. During the course, students will complete daily programming tutorials, read and review contemporary research papers, and propose and develop independent research projects.
BibTex
@inproceedings{poliak-jenifer-2021-immersive,
title = "An Immersive Computational Text Analysis Course for Non-Computer Science Students at Barnard College",
author = "Poliak, Adam and
Jenifer, Jalisha",
booktitle = "Proceedings of the Fifth Workshop on Teaching NLP",
month = jun,
year = "2021",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2021.teachingnlp-1.15",
pages = "92--95",
abstract = "We provide an overview of a new Computational Text Analysis course that will be taught at Barnard College over a six week period in May and June 2021. The course is targeted to non Computer Science at a Liberal Arts college that wish to incorporate fundamental Natural Language Processing tools in their research and studies. During the course, students will complete daily programming tutorials, read and review contemporary research papers, and propose and develop independent research projects.",
}
|
Suicide-Related Internet Searches During the Early Stages of the COVID-19 Pandemic in the US.
John W. Ayers, Adam Poliak, Derek C. Johnson, Eric C. Leas, Mark Dredze, Theodore Caputi, Alicia L. Nobles.
JAMA Network Open 2021.
Abstract
Experts anticipate that the societal fallout associated with the coronavirus disease 2019 (COVID-19) pandemic will increase suicidal behavior, and strategies to address this anticipated increase have been woven into policy decision-making without contemporaneous data. For instance, President Trump cited increased suicides as an argument against COVID-19 control measures during the first presidential debate on September 29, 2020. Given the time delays inherent in traditional population mental health surveillance, it is important for decision-makers to seek other contemporaneous data to evaluate potential associations. To assess the value that free and public internet search query trends can provide to rapidly identify associations, we monitored suicide-related internet search rates during the early stages of the COVID-19 pandemic in the US.
BibTex
@article{ayers2021suicide,
title={Suicide-Related Internet Searches During the Early Stages of the COVID-19 Pandemic in the US},
author={Ayers, John W and Poliak, Adam and Johnson, Derek C and Leas, Eric C and Dredze, Mark and Caputi, Theodore and Nobles, Alicia L},
journal={JAMA Network Open},
volume={4},
number={1},
pages={e2034261--e2034261},
year={2021},
publisher={American Medical Association}
}
News Coverage:
[NYTimes]
[Healio]
|
2020
| |
Revisiting Recognizing Textual Entailment for Evaluating Natural Language Processing Systems.
Adam Poliak.
PhD Thesis, Johns Hopkins University 2020.
Abstract
Recognizing Textual Entailment (RTE) began as a unified framework to evaluate the reasoning capabilities of Natural Language Processing (NLP) models. In recent years, RTE has evolved in the NLP community into a task that researchers focus on developing models for. This thesis revisits the tradition of RTE as an evaluation framework for NLP models, especially in the era of deep learning. Chapter 2 provides an overview of different approaches to evaluating NLP systems, discusses prior RTE datasets, and argues why many of them do not serve as satisfactory tests to evaluate the reasoning capabilities of NLP systems. Chapter 3 presents a new large-scale diverse collection of RTE datasets (DNC) that tests how well NLP systems capture a range of semantic phenomena that are integral to understanding human language. Chapter 4 demonstrates how the DNC can be used to evaluate reasoning capabilities of NLP models. Chapter 5 discusses the limits of RTE as an evaluation framework by illuminating how existing datasets contain biases that may enable crude modeling approaches to perform surprisingly well. The remaining aspects of the thesis focus on issues raised in Chapter 5. Chapter 6 addresses issues in prior RTE datasets focused on paraphrasing and presents a high-quality test set that can be used to analyze how robust RTE systems are to paraphrases. Chapter 7 demonstrates how modeling approaches for overcoming biases, e.g. adversarial learning, can enable RTE models to overcome the biases discussed in Chapter 5. Chapter 8 applies these methods to the task of discovering emergency needs during disaster events.
BibTex
@PhdThesis{poliak:2020:thesis,
author = {Adam Poliak},
title = {Revisiting Recognizing Textual Entailment for Evaluating Natural Language Processing Systems},
school = {Johns Hopkins University},
year = {2020},
}
|
A Survey on Recognizing Textual Entailment as an NLP Evaluation.
Adam Poliak.
Evaluation and Comparison of NLP Systems (Eval4NLP) 2020.
Abstract
Recognizing Textual Entailment (RTE) was proposed as a unified evaluation framework to compare semantic understanding of different NLP systems. In this survey paper, we provide an overview of different approaches for evaluating and understanding the reasoning capabilities of NLP systems. We then focus our discussion on RTE by highlighting prominent RTE datasets as well as advances in RTE datasets that focus on specific linguistic phenomena that can be used to evaluate NLP systems on a fine-grained level. We conclude by arguing that when evaluating NLP systems, the community should utilize newly introduced RTE datasets that focus on specific linguistic phenomena.
Video
BibTex
@inproceedings{RTE-survey-evalNLP,
title={A Survey on Recognizing Textual Entailment as an NLP Evaluation},
author={Poliak, Adam},
year={2020},
booktitle={First Workshop on Evaluation and Comparison of NLP Systems (Eval4NLP)}
}
|
Temporal Reasoning in Natural Language Inference.
Siddharth Vashishtha, Adam Poliak, Yash Kumar Lal, Benjamin Van Durme, Aaron Steven White.
Findings of EMNLP 2020.
Abstract
We introduce five new natural language inference (NLI) datasets focused on temporal reasoning. We recast four existing datasets annotated for even duration—how long an event lasts—and event ordering—how events are temporally arranged—into more than one million NLI examples. We use these datasets to investigate how well neural models trained on a popular NLI corpus capture these forms of temporal reasoning.
Code
BibTex
@inproceedings{Temporal-NLI--MNLP20,
title={Temporal Reasoning in Natural Language Inference},
author={Vashishtha, Siddharth and Poliak, Adam and {Kumar Lal}, Yash and {Van Durme}, Benjamin and White, {Aaron Steven}},
year={2020},
booktitle={Findings of the Association for Computational Linguistics: EMNLP 2020},
publisher = {Association for Computational Linguistics},
month= {nov}
}
|
Quantifying Public Interest in Police Reforms by Mining Internet Search Data Following George Floyd's Death.
John W Ayers, Benjamin M Althouse, Adam Poliak, Eric C Leas, Alicia L Nobles, Mark Dredze, Davey Smith.
Journal of Medical Internet Research (JMIR) 2020.
Abstract
Background: The death of George Floyd while in police custody has resurfaced serious questions about police conduct that result in the deaths of unarmed persons. Objective: Data-driven strategies that identify and prioritize the public's needs may engender a public health response to improve policing. We assessed how internet searches indicative of interest in police reform changed after Mr Floyd's death. Methods: We monitored daily Google searches (per 10 million total searches) that included the terms "police" and "reform(s)" (eg, "reform the police," "best police reforms," etc) originating from the United States between January 1, 2010, through July 5, 2020. We also monitored searches containing the term "police" with "training," "union(s)," "militarization," or "immunity" as markers of interest in the corresponding reform topics. Results: The 41 days following Mr Floyd's death corresponded with the greatest number of police "reform(s)" searches ever recorded, with 1,350,000 total searches nationally. Searches increased significantly in all 50 states and Washington DC. By reform topic, nationally there were 1,220,000 total searches for "police" and "union(s)"; 820,000 for "training"; 360,000 for "immunity"; and 72,000 for "militarization." In terms of searches for all policy topics by state, 33 states searched the most for "training," 16 for "union(s)," and 2 for "immunity." States typically in the southeast had fewer queries related to any police reform topic than other states. States that had a greater percentage of votes for President Donald Trump during the 2016 election searched more often for police "union(s)" while states favoring Secretary Hillary Clinton searched more for police "training." Conclusions: The United States is at a historical juncture, with record interest in topics related to police reform with variability in search terms across states. Policy makers can respond to searches by considering the policies their constituencies are searching for online, notably police training and unions. Public health leaders can respond by engaging in the subject of policing and advocating for evidence-based policy reforms.
BibTex
@Article{info:doi/10.2196/22574,
author="Ayers, John W and Althouse, Benjamin M and Poliak, Adam and Leas, Eric C and Nobles, Alicia L and Dredze, Mark and Smith, Davey",
title="Quantifying Public Interest in Police Reforms by Mining Internet Search Data Following George Floyd's Death",
journal="J Med Internet Res",
year="2020",
month="Oct",
day="21",
volume="22",
number="10",
pages="e22574",
keywords="policing; digital health, bioinformatics; public health; public interest; data mining; internet; search; trend; Google Trends",
abstract="Background: The death of George Floyd while in police custody has resurfaced serious questions about police conduct that result in the deaths of unarmed persons. Objective: Data-driven strategies that identify and prioritize the public's needs may engender a public health response to improve policing. We assessed how internet searches indicative of interest in police reform changed after Mr Floyd's death. Methods: We monitored daily Google searches (per 10 million total searches) that included the terms ``police'' and ``reform(s)'' (eg, ``reform the police,'' ``best police reforms,'' etc) originating from the United States between January 1, 2010, through July 5, 2020. We also monitored searches containing the term ``police'' with ``training,'' ``union(s),'' ``militarization,'' or ``immunity'' as markers of interest in the corresponding reform topics. Results: The 41 days following Mr Floyd's death corresponded with the greatest number of police ``reform(s)'' searches ever recorded, with 1,350,000 total searches nationally. Searches increased significantly in all 50 states and Washington DC. By reform topic, nationally there were 1,220,000 total searches for ``police'' and ``union(s)''; 820,000 for ``training''; 360,000 for ``immunity''; and 72,000 for ``militarization.'' In terms of searches for all policy topics by state, 33 states searched the most for ``training,'' 16 for ``union(s),'' and 2 for ``immunity.'' States typically in the southeast had fewer queries related to any police reform topic than other states. States that had a greater percentage of votes for President Donald Trump during the 2016 election searched more often for police ``union(s)'' while states favoring Secretary Hillary Clinton searched more for police ``training.'' Conclusions: The United States is at a historical juncture, with record interest in topics related to police reform with variability in search terms across states. 
Policy makers can respond to searches by considering the policies their constituencies are searching for online, notably police training and unions. Public health leaders can respond by engaging in the subject of policing and advocating for evidence-based policy reforms. ",
issn="1438-8871",
doi="10.2196/22574",
url="https://doi.org/10.2196/22574"
}
News Coverage:
[Breitbart]
|
Internet Searches for Acute Anxiety During the Early Stages of the COVID-19 Pandemic.
John W Ayers, Eric C Leas, Derek C Johnson, Adam Poliak, Benjamin M Althouse, Mark Dredze, Alicia L Nobles.
JAMA Internal Medicine 2020.
Abstract
There is widespread concern that the coronavirus disease 2019 (COVID-19) pandemic may harm population mental health, chiefly owing to anxiety about the disease and its societal fallout. But traditional population mental health surveillance (eg, telephone surveys, medical records) is time consuming, expensive, and may miss persons who do not participate or seek care. To evaluate the association of COVID-19 with anxiety on a population basis, we examined internet searches indicative of acute anxiety during the early stages of the COVID-19 pandemic.
BibTex
@article{10.1001/jamainternmed.2020.3305,
author = {Ayers, John W. and Leas, Eric C. and Johnson, Derek C. and Poliak, Adam and Althouse, Benjamin M. and Dredze, Mark and Nobles, Alicia L.},
title = {Internet Searches for Acute Anxiety During the Early Stages of the COVID-19 Pandemic},
journal = {JAMA Internal Medicine},
year = {2020},
month = {08},
abstract = {There is widespread concern that the coronavirus disease 2019 (COVID-19) pandemic may harm population mental health, chiefly owing to anxiety about the disease and its societal fallout. But traditional population mental health surveillance (eg, telephone surveys, medical records) is time consuming, expensive, and may miss persons who do not participate or seek care. To evaluate the association of COVID-19 with anxiety on a population basis, we examined internet searches indicative of acute anxiety during the early stages of the COVID-19 pandemic.},
issn = {2168-6106},
doi = {10.1001/jamainternmed.2020.3305},
url = {https://doi.org/10.1001/jamainternmed.2020.3305},
eprint = {https://jamanetwork.com/journals/jamainternalmedicine/articlepdf/2769543/jamainternal\_ayers\_2020\_ld\_200047\_1597172436.19944.pdf}
}
News Coverage:
[CNN]
[CNBC]
[The Hill]
[Forbes]
[Yahoo]
[New York Post]
[Sky News (Italian)]
|
Collecting Verified COVID-19 Question Answer Pairs.
Adam Poliak, Max Fleming, Cash Costello, Kenton W Murray, Mahsa Yarmohammadi, Shivani Pandya, Darius Irani, Milind Agarwal, Udit Sharma, Shuo Sun, Nicola Ivanov, Lingxi Shang, Kaushik Srinivasan, Seolhwa Lee, Xu Han, Smisha Agarwal, João Sedoc.
NLP COVID-19 Workshop @EMNLP 2020.
Abstract
We release a dataset of over 2,200 COVID-19 related Frequently Asked Question-Answer pairs scraped from over 40 trusted websites. We include an additional 24,000 questions pulled from online sources that have been aligned by experts with existing answered questions from our dataset. This paper describes our efforts in collecting the dataset and summarizes the resulting data. Our dataset is automatically updated daily and available at https://covid-19-infobot.org/data/. So far, this data has been used to develop a chatbot providing users information about COVID-19. We encourage others to build analytics and tools upon this dataset as well.
Data
Code
BibTex
@inproceedings{Collecting+COVID_NLP2020,
title={Collecting Verified COVID-19 Question Answer Pairs},
author={Poliak, Adam and Fleming, Max and Costello, Cash and Murray, Kenton W and Yarmohammadi, Mahsa and Pandya, Shivani and Irani, Darius and Agarwal, Milind and Sharma, Udit and Sun, Shuo and others},
year={2020},
booktitle={NLP COVID-19 Workshop @EMNLP},
url={https://openreview.net/forum?id=GR03UfD2OZk}
}
|
Probing Neural Language Models for Human Tacit Assumptions.
Nathaniel Weir, Adam Poliak, Benjamin Van Durme.
CogSci 2020.
Abstract
Humans carry stereotypic tacit assumptions (STAs) (Prince, 1978), or propositional beliefs about generic concepts. Such associations are crucial for understanding natural language. We construct a diagnostic set of word prediction prompts to evaluate whether recent neural contextualized language models trained on large text corpora capture STAs. Our prompts are based on human responses in a psychological study of conceptual associations. We find models to be profoundly effective at retrieving concepts given associated properties. Our results demonstrate empirical evidence that stereotypic conceptual representations are captured in neural models derived from semi-supervised linguistic exposure.
Video
BibTex
@inproceedings{Weir-et-al:2020,
author = {Nathaniel Weir and Adam Poliak and Benjamin {Van Durme}},
title = {Probing Neural Language Models for Human Tacit Assumptions},
booktitle = {42nd Annual Virtual Meeting of the Cognitive Science Society (CogSci)},
year = {2020},
url = {https://cognitivesciencesociety.org/cogsci20/papers/0070/0070.pdf}
}
|
Uncertain Natural Language Inference.
Tongfei Chen*, Zhengping Jiang*, Adam Poliak, Keisuke Sakaguchi, Benjamin Van Durme.
ACL 2020.
Abstract
We introduce Uncertain Natural Language Inference (UNLI), a refinement of Natural Language Inference (NLI) that shifts away from categorical labels, targeting instead the direct prediction of subjective probability assessments. We demonstrate the feasibility of collecting annotations for UNLI by relabeling a portion of the SNLI dataset under a probabilistic scale, where items even with the same categorical label differ in how likely people judge them to be true given a premise. We describe a direct scalar regression modeling approach, and find that existing categorically labeled NLI data can be used in pre-training. Our best models approach human performance, demonstrating models may be capable of more subtle inferences than the categorical bin assignment employed in current NLI tasks.
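A minimal sketch of the direct scalar regression idea: instead of predicting one of three categorical labels, fit a sigmoid-squashed score against human probability judgments. The word-overlap feature, the toy premise-hypothesis pairs, and their probabilities are all invented stand-ins for a real sentence encoder and the relabeled SNLI data:

```python
import math

# Toy UNLI-style items: (premise, hypothesis, subjective probability that
# the hypothesis is true given the premise). All values are invented.
data = [
    ("A dog runs in the park", "A dog runs in the park", 0.98),
    ("A dog runs in the park", "A dog is in the park", 0.85),
    ("A dog runs in the park", "A cat sleeps indoors", 0.05),
]

def overlap(premise, hypothesis):
    # Crude feature: fraction of hypothesis words appearing in the premise.
    p, h = set(premise.lower().split()), set(hypothesis.lower().split())
    return len(p & h) / len(h)

# Scalar regression: sigmoid(w*x + b) trained with squared error.
w, b, lr = 0.0, 0.0, 0.5
for _ in range(5000):
    for premise, hypothesis, y in data:
        x = overlap(premise, hypothesis)
        pred = 1 / (1 + math.exp(-(w * x + b)))
        grad = (pred - y) * pred * (1 - pred)  # d(MSE)/d(logit)
        w -= lr * grad * x
        b -= lr * grad

def predict(premise, hypothesis):
    x = overlap(premise, hypothesis)
    return 1 / (1 + math.exp(-(w * x + b)))

high = predict("A dog runs in the park", "A dog is in the park")
low = predict("A dog runs in the park", "A cat sleeps indoors")
print(high > low)  # likely hypotheses are scored above unlikely ones
```

The point of the sketch is the output space: a probability in [0, 1] that can distinguish items sharing the same categorical NLI label.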
Data
Code
BibTex
@inproceedings{Chen-et-al:2020,
author = {Tongfei Chen and Zhengping Jiang and Adam Poliak and Keisuke Sakaguchi and Benjamin {Van Durme}},
title = {Uncertain Natural Language Inference},
booktitle = {Proceedings of The 58th Annual Meeting of the Association for Computational Linguistics (ACL)},
year = {2020},
}
|
2019
| |
Adversarial Learning for Robust Emergency Need Discovery in Low Resource Settings.
Adam Poliak, Benjamin Van Durme.
West Coast NLP (WeCNLP) 2019.
BibTex
@article{poliak2019adv-wecnlp,
title={Adversarial Learning for Robust Emergency Need Discovery in Low Resource Settings},
author={Poliak, Adam and Van Durme, Benjamin},
journal={Second Annual West Coast NLP (WeCNLP) Summit},
year={2019}
}
|
Don’t Take the Premise for Granted: Mitigating Artifacts in Natural Language Inference.
Yonatan Belinkov*, Adam Poliak*, Stuart M. Shieber, Benjamin Van Durme, Alexander Rush.
ACL 2019.
Abstract
Natural Language Inference (NLI) datasets often contain hypothesis-only biases—artifacts that allow models to achieve non-trivial performance without learning whether a premise entails a hypothesis. We propose two probabilistic methods to build models that are more robust to such biases and better transfer across datasets. In contrast to standard approaches to NLI, our methods predict the probability of a premise given a hypothesis and NLI label, discouraging models from ignoring the premise. We evaluate our methods on synthetic and existing NLI datasets by training on datasets containing biases and testing on datasets containing no (or different) hypothesis-only biases. Our results indicate that these methods can make NLI models more robust to dataset-specific artifacts, transferring better than a baseline architecture in 9 out of 12 NLI datasets. Additionally, we provide an extensive analysis of the interplay of our methods with known biases in NLI datasets, as well as the effects of encouraging models to ignore biases and fine-tuning on target datasets.
Code
Video
BibTex
@inproceedings{mitigating-artificats-nli-acl2019,
title = {Don’t Take the Premise for Granted: Mitigating Artifacts in Natural Language Inference},
author = {Belinkov, Yonatan and Poliak, Adam and Shieber, {Stuart M.} and {Van Durme}, Benjamin and Rush, Alexander},
year = {2019},
booktitle = {ACL}
}
|
On Adversarial Removal of Hypothesis-only Bias in Natural Language Inference.
Yonatan Belinkov*, Adam Poliak*, Stuart M. Shieber, Benjamin Van Durme, Alexander Rush.
StarSem 2019.
Abstract
Popular Natural Language Inference (NLI) datasets have been shown to be tainted by hypothesis-only biases. Adversarial learning may help models ignore sensitive biases and spurious correlations in data. We evaluate whether adversarial learning can be used in NLI to encourage models to learn representations free of hypothesis-only biases. Our analyses indicate that the representations learned via adversarial learning may be less biased, with only small drops in NLI accuracy.
Code
BibTex
@inproceedings{on-adv-removal-hypothesis-only-bias-in-natural-language-inference,
title = {On Adversarial Removal of Hypothesis-only Bias in Natural Language Inference},
author = {Belinkov, Yonatan and Poliak, Adam and Shieber, {Stuart M.} and {Van Durme}, Benjamin and Rush, Alexander},
year = {2019},
booktitle = {Joint Conference on Lexical and Computational Semantics (StarSem)}
}
|
Probing what different NLP tasks teach machines about function word comprehension.
Best Paper Award.
Najoung Kim, Roma Patel, Adam Poliak, Patrick Xia, Alex Wang, R. Thomas Mccoy, Ian Tenney, Alexis Ross, Tal Linzen, Benjamin Van Durme, Samuel Bowman, Ellie Pavlick.
StarSem 2019.
Abstract
We introduce a set of nine challenge tasks that test for the understanding of function words. These tasks are created by structurally mutating sentences from existing datasets to target the comprehension of specific types of function words (e.g., prepositions, wh-words). Using these probing tasks, we explore the effects of various pretraining objectives for sentence encoders (e.g., language modeling, CCG supertagging and natural language inference (NLI)) on the learned representations. Our results show that pretraining on CCG—our most syntactic objective—performs the best on average across our probing tasks, suggesting that syntactic knowledge helps function word comprehension. Language modeling also shows strong performance, supporting its widespread use for pretraining state-of-the-art NLP models. Overall, no pretraining objective dominates across the board, and our function word probing tasks highlight several intuitive differences between pretraining objectives, e.g., that NLI helps the comprehension of negation.
Data
BibTex
@inproceedings{kimStarSem19,
title = {Probing What Different NLP Tasks Teach Machines about Function Word Comprehension},
booktitle = {Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics ({*SEM})},
author = {Kim, Najoung and Patel, Roma and Poliak, Adam and Wang, Alex and Xia, Patrick and McCoy, R. Thomas and Tenney, Ian and Ross, Alexis and Linzen, Tal and {Van Durme}, Benjamin and Bowman, Samuel R. and Pavlick, Ellie},
pdf = {https://www.aclweb.org/anthology/S19-1026},
year = {2019},
numpages = {15}
}
|
What do you learn from context? Probing for sentence structure in contextualized word representations.
Ian Tenney, Patrick Xia, Berlin Chen, Alex Wang, Adam Poliak, R. Thomas McCoy, Najoung Kim, Benjamin Van Durme, Samuel R. Bowman, Dipanjan Das, Ellie Pavlick.
ICLR 2019.
Abstract
Contextualized representation models such as ELMo (Peters et al., 2018a) and BERT (Devlin et al., 2018) have recently achieved state-of-the-art results on a diverse array of downstream NLP tasks. Building on recent token-level probing work, we introduce a novel edge probing task design and construct a broad suite of sub-sentence tasks derived from the traditional structured NLP pipeline. We probe word-level contextual representations from four recent models and investigate how they encode sentence structure across a range of syntactic, semantic, local, and long-range phenomena. We find that existing models trained on language modeling and translation produce strong representations for syntactic phenomena, but only offer comparably small improvements on semantic tasks over a non-contextual baseline.
BibTex
@inproceedings{tenney2018what,
title={What do you learn from context? Probing for sentence structure in contextualized word representations},
author={Ian Tenney and Patrick Xia and Berlin Chen and Alex Wang and Adam Poliak and R Thomas McCoy and Najoung Kim and Benjamin Van Durme and Sam Bowman and Dipanjan Das and Ellie Pavlick},
booktitle={International Conference on Learning Representations},
year={2019},
url={https://openreview.net/forum?id=SJzSgnRcKX},
}
|
2018
| |
Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation.
Adam Poliak, Aparajita Haldar, Rachel Rudinger, J. Edward Hu, Ellie Pavlick, Aaron Steven White, Benjamin Van Durme.
EMNLP 2018.
Abstract
We present a large-scale collection of diverse natural language inference (NLI) datasets that help provide insight into how well a sentence representation captures distinct types of reasoning. The collection results from recasting 13 existing datasets from 7 semantic phenomena into a common NLI structure, resulting in over half a million labeled context-hypothesis pairs in total. We refer to our collection as the DNC: Diverse Natural Language Inference Collection. The DNC is available online at https://www.decomp.net, and will grow over time as additional resources are recast and added from novel sources.
Data
Slides
Video
BibTex
@inproceedings{poliak2018emnlp-DNC,
title = {{Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation}},
author = {Poliak, Adam and Haldar, Aparajita and Rudinger, Rachel and Hu, J. Edward and Pavlick, Ellie and White, Aaron Steven and {Van Durme}, Benjamin},
booktitle = {Empirical Methods in Natural Language Processing (EMNLP)},
year = {2018}
}
|
Hypothesis Only Baselines in Natural Language Inference.
Best Paper Award.
Adam Poliak, Jason Naradowsky, Aparajita Haldar, Rachel Rudinger, Benjamin Van Durme.
StarSem 2018.
Abstract
We propose a hypothesis only baseline for diagnosing Natural Language Inference (NLI). Especially when an NLI dataset assumes inference is occurring based purely on the relationship between a context and a hypothesis, it follows that assessing entailment relations while ignoring the provided context is a degenerate solution. Yet, through experiments on 10 distinct NLI datasets, we find that this approach, which we refer to as a hypothesis-only model, is able to significantly outperform a majority-class baseline across a number of NLI datasets. Our analysis suggests that statistical irregularities may allow a model to perform NLI in some datasets beyond what should be achievable without access to the context.
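The hypothesis-only setup can be illustrated with a toy classifier that never sees the premise; the sentences and the count-based scoring below are invented for illustration and are not the paper's actual model:

```python
from collections import Counter, defaultdict

# Toy NLI examples: (premise, hypothesis, label). The sentences are
# invented; the paper's experiments use 10 real NLI datasets.
train = [
    ("A man plays guitar", "A person makes music", "entailment"),
    ("A dog runs outside", "An animal is outdoors", "entailment"),
    ("A woman reads a book", "Nobody is reading", "contradiction"),
    ("Kids play soccer", "Nobody is playing", "contradiction"),
]

# "Hypothesis-only" model: score labels with word-label counts collected
# from the hypotheses alone, ignoring the premise entirely.
word_label = defaultdict(Counter)
for _, hypothesis, label in train:
    for w in hypothesis.lower().split():
        word_label[w][label] += 1

def predict(hypothesis):
    scores = Counter()
    for w in hypothesis.lower().split():
        for label, count in word_label[w].items():
            scores[label] += count
    return scores.most_common(1)[0][0] if scores else "entailment"

# Artifact words like "nobody" let the model guess the label without
# ever seeing a premise -- the statistical irregularity the paper probes.
print(predict("Nobody is singing"))     # contradiction
print(predict("A person is outdoors"))  # entailment
```

If such a model beats the majority-class baseline on a dataset, that dataset contains hypothesis-side artifacts.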
Code
Slides
BibTex
@inproceedings{hypothesis-only-baselines-in-natural-language-inference,
title = {{Hypothesis Only Baselines in Natural Language Inference}},
author = {Poliak, Adam and Naradowsky, Jason and Haldar, Aparajita and Rudinger, Rachel and {Van Durme}, Benjamin},
year = {2018},
booktitle = {Joint Conference on Lexical and Computational Semantics (StarSem)}
}
|
On the Evaluation of Semantic Phenomena in Neural Machine Translation Using Natural Language Inference.
Adam Poliak, Yonatan Belinkov, Jim Glass, Benjamin Van Durme.
NAACL 2018.
Abstract
We propose a process for investigating the extent to which sentence representations arising from neural machine translation (NMT) systems encode distinct semantic phenomena. We use these representations as features to train a natural language inference (NLI) classifier based on datasets recast from existing semantic annotations. In applying this process to a representative NMT system, we find its encoder appears most suited to supporting inferences at the syntax-semantics interface, as compared to anaphora resolution requiring world knowledge. We conclude with a discussion on the merits and potential deficiencies of the existing process, and how it may be improved and extended as a broader framework for evaluating semantic coverage.
Code
BibTex
@inproceedings{evaluating-fine-grained-semantic-phenomena-in-neural-machine-translation-encoders-using-entailment,
author = {Poliak, Adam and Belinkov, Yonatan and Glass, James and {Van Durme}, Benjamin},
title = {On the Evaluation of Semantic Phenomena in Neural Machine Translation Using Natural Language Inference},
year = {2018},
numpages = {7},
booktitle = {Proceedings of the Annual Meeting of the North American Association of Computational Linguistics (NAACL)}
}
|
Neural Variational Entity Set Expansion for Automatically Populated Knowledge Graphs.
Pushpendre Rastogi, Adam Poliak, Vince Lyzinski, and Benjamin Van Durme.
Information Retrieval Journal 2018.
Abstract
We propose Neural Variational Set Expansion to extract actionable information from a noisy knowledge graph (KG) and present a general approach for increasing the interpretability of recommendation systems. We demonstrate the usefulness of applying a variational autoencoder to the entity set expansion task based on a realistic, automatically generated KG.
Code
Video
BibTex
@article{neural-variational-entity-set-expansion-for-automatically-populated-knowledge-graphs,
title = {{Neural Variational Entity Set Expansion for Automatically Populated Knowledge Graphs}},
author = {Rastogi, Pushpendre and Poliak, Adam and Lyzinski, Vince and {Van Durme}, Benjamin},
year = {2018},
journal = {Information Retrieval Journal},
month = oct,
day = {25},
issn = {1573-7659},
doi = {10.1007/s10791-018-9342-1},
url = {https://rdcu.be/98BY},
numpages = {24}
}
|
2017
| |
CADET: Computer Assisted Discovery Extraction and Translation.
Benjamin Van Durme, Tom Lippincott, Kevin Duh, Deana Burchfield, Adam Poliak, Cash Costello, Tim Finin, Scott Miller, James Mayfield, Philipp Koehn, Craig Harman, Dawn Lawrie, Chandler May, Max Thomas, Annabelle Carrell, Julianne Chaloux, Tongfei Chen, Alex Comerford, Mark Dredze, Benjamin Glass, Shudong Hao, Patrick Martin, Pushpendre Rastogi, Rashmi Sankepally, Travis Wolfe, Ying-Ying Tran and Ted Zhang.
IJCNLP 2017.
Abstract
Computer Assisted Discovery Extraction and Translation (CADET) is a workbench for helping knowledge workers find, label, and translate documents of interest. It combines a multitude of analytics together with a flexible environment for customizing the workflow for different users. This open-source framework allows for easy development of new research prototypes using a micro-service architecture based atop Docker and Apache Thrift.
Code
BibTex
@inproceedings{cadet-computer-assisted-discovery-extraction-and-translation,
title = {{CADET: Computer Assisted Discovery Extraction and Translation}},
author = {{Van Durme}, Benjamin and Lippincott, Tom and Duh, Kevin and Burchfield, Deana and Poliak, Adam and Costello, Cash and Finin, Tim and Miller, Scott and Mayfield, James and Koehn, Philipp and Harman, Craig and Lawrie, Dawn and May, Chandler and Thomas, Max and Chaloux, Julianne and Carrell, Annabelle and Chen, Tongfei and Comerford, Alex and Dredze, Mark and Glass, Benjamin and Hao, Shudong and Martin, Patrick and Sankepally, Rashmi and Rastogi, Pushpendre and Wolfe, Travis and Tran, Ying-Ying and Zhang, Ted},
booktitle = {Proceedings of the 8th International Conference on Natural Language Processing (IJCNLP): System Demonstrations},
year = {2017},
numpages = {4},
url = {http://www.aclweb.org/anthology/I17-3002},
keywords = {extraction,interactive,systems,select}
}
|
Frame-Based Continuous Lexical Semantics through Exponential Family Tensor Factorization and Semantic Proto-Roles.
Francis Ferraro, Adam Poliak, Ryan Cotterell and Benjamin Van Durme.
StarSem 2017.
Abstract
We study how different frame annotations complement one another when learning continuous lexical semantics. We learn the representations from a tensorized skip-gram model that consistently encodes syntactic-semantic content better, with multiple 10% gains over baselines.
Code
BibTex
@InProceedings{ferraro-EtAl:2017:starSEM,
author = {Ferraro, Francis and Poliak, Adam and Cotterell, Ryan and Van Durme, Benjamin},
title = {Frame-Based Continuous Lexical Semantics through Exponential Family Tensor Factorization and Semantic Proto-Roles},
booktitle = {Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017)},
month = {August},
year = {2017},
address = {Vancouver, Canada},
publisher = {Association for Computational Linguistics},
pages = {97--103},
abstract = {We study how different frame annotations complement one another when learning continuous lexical semantics. We learn the representations from a tensorized skip-gram model that consistently encodes syntactic-semantic content better, with multiple 10% gains over baselines.},
url = {http://www.aclweb.org/anthology/S17-1011}
}
|
Efficient, Compositional, Order-sensitive n-gram Embeddings.
Adam Poliak*, Pushpendre Rastogi*, M. Patrick Martin and Benjamin Van Durme.
EACL 2017.
Abstract
We propose ECO: a new way to generate embeddings for phrases that is Efficient, Compositional, and Order-sensitive. Our method creates decompositional embeddings for words offline and combines them to create new embeddings for phrases in real time. Unlike other approaches, ECO can create embeddings for phrases not seen during training. We evaluate ECO on supervised and unsupervised tasks and demonstrate that creating phrase embeddings that are sensitive to word order can help downstream tasks.
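A rough sketch of the order-sensitivity idea (not ECO's actual decomposition): give each word one vector per relative position and sum the position-specific vectors, so reversing a phrase changes its embedding. The vocabulary, dimensionality, and random vectors are all invented for illustration:

```python
import random

random.seed(0)
DIM = 8

# Hypothetical decompositional embeddings: one vector per (word, position)
# pair, here for the first or second slot of a bigram. In ECO these are
# trained offline; random vectors suffice to show the mechanism.
vocab = ["dog", "bites", "man"]
emb = {(w, pos): [random.gauss(0, 1) for _ in range(DIM)]
       for w in vocab for pos in (0, 1)}

def phrase_embedding(words):
    """Compose a phrase embedding in real time by summing each word's
    position-specific vector; word order changes which vector is used."""
    out = [0.0] * DIM
    for pos, w in enumerate(words):
        for i, v in enumerate(emb[(w, pos)]):
            out[i] += v
    return out

e1 = phrase_embedding(["dog", "bites"])
e2 = phrase_embedding(["bites", "dog"])
print(e1 != e2)  # True: the composition is order-sensitive
```

Because composition happens at lookup time, phrases never seen during training still get embeddings, which is the property the abstract highlights.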
Data
Code
BibTex
@inproceedings{poliak-etal-2017-efficient,
author = {Poliak, Adam and Rastogi, Pushpendre and Martin, {M. Patrick} and {Van Durme}, Benjamin},
title = {{Efficient, Compositional, Order-sensitive n-gram Embeddings}},
booktitle = {The 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL)},
year = {2017},
numpages = {6},
keywords = {semantics},
url = {http://www.aclweb.org/anthology/E17-2081}
}
|
Explaining and Generalizing Skip-Gram through Exponential Family Principal Component Analysis.
Ryan Cotterell, Adam Poliak, Benjamin Van Durme and Jason Eisner.
EACL 2017.
Abstract
The popular skip-gram model induces word embeddings by exploiting the signal from word-context co-occurrence. We offer a new interpretation of skip-gram based on exponential family PCA, a form of matrix factorization, to generalize the skip-gram model to tensor factorization. In turn, this lets us train embeddings through richer higher-order co-occurrences, e.g., triples that include positional information (to incorporate syntax) or morphological information (to share parameters across related words). We experiment on 40 languages and show our model improves upon skip-gram.
Code
BibTex
@InProceedings{E17-2028,
author = "Cotterell, Ryan and Poliak, Adam and Van Durme, Benjamin and Eisner, Jason",
title = "Explaining and Generalizing Skip-Gram through Exponential Family Principal Component Analysis",
booktitle = "Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers",
year = "2017",
publisher = "Association for Computational Linguistics",
pages = "175--181",
location = "Valencia, Spain",
url = "http://aclweb.org/anthology/E17-2028"
}
|
Semantic Proto-Role Labeling.
Adam Teichert, Adam Poliak, Benjamin Van Durme, Matthew R. Gormley.
AAAI 2017.
Abstract
We present the first large-scale, corpus based verification of Dowty's seminal theory of proto-roles. Our results demonstrate both the need for and the feasibility of a property-based annotation scheme of semantic relationships, as opposed to the currently dominant notion of categorical roles.
BibTex
@inproceedings{semantic-proto-role-labeling,
title = {{Semantic Proto-Role Labeling}},
author = {Teichert, Adam and Poliak, Adam and {Van Durme}, Benjamin and Gormley, Matthew},
booktitle = {AAAI Conference on Artificial Intelligence (AAAI)},
year = {2017},
numpages = {7},
url = {https://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14997/14053}
}
|
Training Relation Embeddings under Logical Constraints.
Pushpendre Rastogi, Adam Poliak, Benjamin Van Durme.
KG4IR 2017.
Abstract
We present ways of incorporating logical rules into the construction of embedding-based Knowledge Base Completion (KBC) systems. Enforcing "logical consistency" in the predictions of a KBC system guarantees that the predictions comply with logical rules such as symmetry, implication, and generalized transitivity. Our method encodes logical rules about entities and relations as convex constraints in the embedding space to enforce the condition that the score of a logically entailed fact must never be less than the minimum score of an antecedent fact. Such constraints provide a weak guarantee that the predictions made by our KBC model will match the output of a logical knowledge base for many types of logical inferences. We validate our method via experiments on a knowledge graph derived from WordNet.
BibTex
@inproceedings{training-relation-embeddings-under-logical-constraints,
title = {{Training Relation Embeddings under Logical Constraints}},
author = {Rastogi, Pushpendre and Poliak, Adam and {Van Durme}, Benjamin},
booktitle = {Proceedings of the First Workshop on Knowledge Graphs and Semantics for Text Retrieval and Analysis (KG4IR)},
editor = {L. Dietz and C. Xiong and E. Meij},
year = {2017},
numpages = {7},
keywords = {semantics,inference},
url = {https://pdfs.semanticscholar.org/2341/78756b8bf3b2694671583084b22c76c47560.pdf}
}
|
Generating Automatic Pseudo-entailments from AMR Parses.
Adam Poliak and Benjamin Van Durme.
MASC-SLL 2017.
Abstract
We explore how to generate a textual inference dataset from Abstract Meaning Representations (AMR) (Banarescu et al., 2013). Various aspects of AMR make it problematic to automatically derive inference patterns. Therefore, our generated dataset instead answers questions regarding the relation between entities and actions in sentences. We refer to these answers as pseudo-entailments. From this dataset, it is possible to then extract entailments from sentences.
BibTex
@inproceedings{poliak2017generating,
title={Generating automatic pseudo-entailments from AMR parses},
author={Poliak, Adam and Van Durme, Benjamin},
booktitle={6th Mid-Atlantic Student Colloquium on Speech, Language and Learning},
year={2017}
}
|