"This report aims to introduce techniques and procedures of NLP (Natural Language Processing), the computational preparation and analysis of text data, to map the public voice and aid development. First, the report introduces essential concepts of communication and elaborates on the theoretical foundations of natural language analyses. Second, the report reviews research on NLP of social media text data by showcasing studies that have applied the techniques to the Sustainable Development Goals. Third, the report reviews specific NLP techniques, including data preprocessing, and discusses libraries and programming procedures. It also reviews concepts such as keyword extraction to identify relevant terms, topic modeling to detect common themes, and text classification to recognize language features. These NLP techniques are showcased in two case studies. The first shows how topic modeling can be applied to derive insights on the public debate over climate change in Australia. The second demonstrates how text classification can be leveraged to analyze public sentiment on COVID-19 in the Philippines. Finally, the report discusses the challenges and limitations, as well as the potentials, of NLP." (Foreword)
Introduction, 1
Communication and Language, 8
Addressing the Sustainable Development Goals through Natural Language Processing, 16
Data Preparation, 28
Data Representation, 42
Text Analysis Using Natural Language Processing, 48
Text Classification, 74
Conclusions, 93