METHODS: Text ads from Google searches in eight countries (Bahamas, Germany, India, Malaysia, Mexico, South Africa, United Arab Emirates, and United States) were collected in 2022, totaling 1,974 pre-policy and 3,262 post-policy ads, and analyzed in 2023. Two coders labeled 707 ads to establish a gold-standard dataset, which was used to train five natural language processing models to label the ads for content and target demographics. Descriptive statistics and multivariable logistic regression models were used to compare ad content before versus after policy implementation, both globally and by country.
RESULTS: Vertex AI was the best-performing natural language processing model, with the highest F1 score (0.87). From pre- to post-policy implementation, the prevalence of the labels "Racial or Ethnic Identification" and "Ingredients: Natural" decreased significantly, by 47% and 66%, respectively. Notable differences from pre- to post-policy implementation were identified in India, Mexico, and Germany.
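As an illustration of the F1 metric used to compare the models, the following is a minimal sketch that scores one binary label against gold-standard annotations. The function name `f1_score` and the toy labels are hypothetical; the abstract does not state how F1 was averaged across labels, so this covers a single label only.

```python
def f1_score(gold, predicted, positive=1):
    """F1 for one binary label: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold labels and model predictions for six ads.
gold = [1, 1, 1, 0, 0, 1]
predicted = [1, 1, 0, 0, 1, 1]
score = f1_score(gold, predicted)
```

With these toy labels there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75.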
CONCLUSIONS: The study observed changes in skin-lightening product advertisement labels from pre- to post-policy implementation, both globally and within countries. Considering the influence of digital advertising on colorist norms, assessing digital ad policy changes is crucial for public health surveillance. This study presents a computational method to help monitor digital platform policies for consumer product advertisements that affect public health.
METHODS: Data were collected and analysed over three months from January 2023 using an ontology-based information extraction system (Semantic Hub). The system identified patient "stories" and extracted themes from online posts from January 2013 to March 2023, focusing on Korea and Taiwan by filtering the geographic location of users, the language used, and the local online platforms. Extracted texts were structured into knowledge graphs and analysed descriptively.
RESULTS: The patient voice was identified in 133,857 messages (9,620 patients) from the Naver online platform in Korea, including internet chat forums focused on macular degeneration. The most important factors for age-related macular degeneration (AMD) treatments were effectiveness (1,632/3,401 mentions; 48%), price and access to insurance (33%), tolerability (10%), and doctor and clinic recommendations (9%). The treatment burden associated with intravitreal injection of vascular endothelial growth factor inhibitors was related to tolerability (254/942 mentions; 27%), financial burden (20%), hospital selection (18%), and emotional burden (14%). In Taiwan, 444 messages were identified from Facebook, YouTube, and Instagram. The success of treatment was judged by improvements in visual acuity (20/121 mentions; 16.5%), effect on oedema (10.7%), less distortion (9.1%), and inhibition of angiogenesis (5.8%). Tolerability concerns were rarely mentioned (26/440 mentions; 5.9%).
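The reported shares follow directly from the mention counts. A small sketch of that arithmetic, using only the count pairs given in the results (the helper name `share` is hypothetical):

```python
def share(mentions, total):
    """Percentage of mentions out of a total, rounded to one decimal place."""
    return round(100 * mentions / total, 1)


# Count pairs taken from the reported results.
effectiveness = share(1632, 3401)        # Korea: effectiveness
tolerability_burden = share(254, 942)    # Korea: tolerability as treatment burden
visual_acuity = share(20, 121)           # Taiwan: visual acuity
tolerability_taiwan = share(26, 440)     # Taiwan: tolerability concerns
```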
CONCLUSIONS: Digital Listening using Semantic-NLP can provide real-world insights from large amounts of internet data quickly and with low human labour cost. This allows healthcare companies to respond to the unmet needs of patients for effective and safe treatment and improved patient quality of life throughout the product lifecycle.
METHODS: The study focused on analyzing 3,200 comments from Weibo, concentrating on six prominent topics linked to women's marriage and fertility. These topics were treated as research cases. The research employed natural language processing techniques, such as sentiment orientation analysis, Word2Vec, and TextRank.
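To illustrate one of the techniques named above, the following is a minimal sketch of TextRank keyword ranking: PageRank run over a word co-occurrence graph. It assumes the comments have already been segmented into tokens; the function name, window size, damping factor, and example tokens are all assumptions, not details from the study.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50):
    """Rank words by PageRank over an undirected co-occurrence graph."""
    # Link two distinct words if they occur within `window` positions.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            if tokens[j] != word:
                neighbors[word].add(tokens[j])
                neighbors[tokens[j]].add(word)
    if not neighbors:
        return []
    n = len(neighbors)
    scores = {w: 1.0 / n for w in neighbors}
    for _ in range(iterations):
        updated = {}
        for w in neighbors:
            # Each neighbor passes on an equal share of its current score.
            rank = sum(scores[v] / len(neighbors[v]) for v in neighbors[w])
            updated[w] = (1 - damping) / n + damping * rank
        scores = updated
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical token sequence in which "cost" co-occurs with every other word.
tokens = ["cost", "childcare", "cost", "housing", "cost", "career"]
ranked = textrank_keywords(tokens, window=1)
```

In this toy sequence "cost" sits at the center of the graph, so it receives the highest score.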
RESULTS: Firstly, the overall sentiment orientation of Chinese women toward marriage and fertility was largely pessimistic. Secondly, the factors contributing to this negative sentiment were categorized into four dimensions: social policies and rights protection, concerns related to parenting, values and beliefs associated with marriage and fertility, and family and societal culture.
CONCLUSION: Based on these outcomes, the study proposed a range of mechanisms and pathways to improve women's sentiment orientation toward marriage and fertility. These mechanisms include safeguarding women's and children's rights, promoting parenting education, providing positive guidance on social media, and cultivating a diverse and inclusive social and cultural environment. The objective is to offer precise and comprehensive reference points for formulating policies that align more effectively with practical needs.
METHODS: For the experiments, autopsy reports covering eight different causes of death were collected, preprocessed, and converted into 43 master feature vectors using various schemes for feature extraction, representation, and reduction. Six different text classification techniques were applied to these 43 master feature vectors to construct a classification model that can predict the cause of death. Finally, classification model performance was evaluated using four performance measures: overall accuracy, macro precision, macro recall, and macro F-measure.
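The four performance measures can be sketched as follows. This is a minimal illustration, assuming macro F-measure is the unweighted mean of per-class F-measures (some works instead take the harmonic mean of macro precision and macro recall); the function name and class labels are hypothetical.

```python
def macro_metrics(gold, predicted):
    """Overall accuracy plus macro-averaged precision, recall, and F-measure."""
    labels = sorted(set(gold) | set(predicted))
    pairs = list(zip(gold, predicted))
    precisions, recalls, f_measures = [], [], []
    for label in labels:
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f_measures.append(f)
    n = len(labels)
    return {
        "accuracy": sum(1 for g, p in pairs if g == p) / len(pairs),
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f": sum(f_measures) / n,
    }


# Hypothetical labels for six reports across three classes.
gold = ["a", "a", "b", "b", "c", "c"]
predicted = ["a", "b", "b", "b", "c", "a"]
metrics = macro_metrics(gold, predicted)
```

Macro averaging gives every class equal weight regardless of how many reports it contains, which matters when some causes of death are rare.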
RESULTS: The experiments showed that unigram features achieved the highest performance compared with bigram, trigram, and hybrid-gram features. Among feature representation schemes, term frequency and term frequency with inverse document frequency produced similar results, and both outperformed binary frequency and normalized term frequency with inverse document frequency. Among feature reduction approaches, chi-square outperformed Pearson correlation and information gain. Finally, among text classification algorithms, the support vector machine classifier outperformed random forest, Naive Bayes, k-nearest neighbor, decision tree, and ensemble-voted classifiers.
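As a sketch of how chi-square feature reduction scores a unigram, the statistic can be computed from a 2x2 table of term presence against class membership. The function name, the toy documents, and the class labels below are hypothetical and are not drawn from the autopsy data.

```python
def chi_square(docs, labels, term, target):
    """Chi-square statistic for the association of `term` with class `target`.

    docs   : list of token sets, one per document
    labels : list of class labels, aligned with docs
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1  # term present, target class
        elif has_term:
            b += 1  # term present, other class
        elif in_class:
            c += 1  # term absent, target class
        else:
            d += 1  # term absent, other class
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # term or class is constant: no association measurable
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical unigram sets for four reports in two classes.
docs = [{"stab", "wound"}, {"stab", "chest"}, {"smoke", "burn"}, {"burn", "wound"}]
labels = ["sharp", "sharp", "fire", "fire"]
stab_score = chi_square(docs, labels, "stab", "sharp")
wound_score = chi_square(docs, labels, "wound", "sharp")
```

Here "stab" appears only in the target class and so scores high, while "wound" is spread evenly across both classes and scores zero; a reduction step would keep the highest-scoring terms.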
CONCLUSION: Our results and comparisons are of practical importance and serve as references for future work. Moreover, the comparison outputs can serve as state-of-the-art baselines against which future proposals for automated text classification can be compared.