Natural language processing and BERT for author profiling on the social network X

Authors

  • Ivan Petrlik Azabache
  • Ciro Rodríguez Rodríguez
  • Pedro Lezama Gonzales
  • Luz Torres-Talaverano
  • Enma Graciela Vásquez Hurtado
  • Karina Inés Hinojosa Pedraza

Keywords:

Natural language, BERT, Profiling, Social Network X

Abstract

Today, X has become one of the most important social
networks for expressing opinions and interests on the web.
The large volume of data it generates allows automated
systems to profile users by gender, nationality, and
thematic interests. This process is difficult not only
because posts are short, but also because they are often
ambiguous and written in several languages.
The goal of this proposal is to build a deep learning model based on BERT that can identify demographic and thematic attributes from tweets. Pre-trained BERT and Multilingual BERT models will be applied to the PAN Author Profiling Task (CLEF 2019) corpora in English and Spanish.
The proposed work will deepen the analysis using supervised classification for gender and nationality and unsupervised topic extraction with techniques such as LDA and BERTopic. The pipeline includes preprocessing, dimensionality reduction (UMAP), and evaluation using metrics such as precision and accuracy.
The results are expected to demonstrate the applicability of BERT to automatic profiling in marketing, socio-political analysis, and content personalization.
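
The evaluation step named in the abstract (precision and accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `accuracy` and `precision_per_class` and the six-author label lists are hypothetical and chosen only for illustration. The sketch assumes a single-label classification setting, such as gender prediction per author.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty label list")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision_per_class(y_true, y_pred):
    """Precision for each predicted label: TP / (TP + FP)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # How many times each label was predicted (TP + FP).
    predicted = Counter(y_pred)
    # How many of those predictions were correct (TP).
    true_pos = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    return {label: true_pos[label] / n for label, n in predicted.items()}


# Hypothetical gold and predicted gender labels for six authors
# (illustrative values, not results from the paper).
gold = ["female", "male", "female", "male", "female", "male"]
pred = ["female", "male", "male", "male", "female", "female"]

print(accuracy(gold, pred))
print(precision_per_class(gold, pred))
```

Reporting precision per class alongside accuracy is useful when the classes are imbalanced, because accuracy alone can hide poor performance on a minority class.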

Published

2025-08-22
