• Publication year: 1995–2024
• Content: Must include all four domains (AV, AI, ML, DL)
• Non-English papers (29)
• Non-article types (1,769)
• Articles published in 2025 (63)
• Manually excluded irrelevant topics (37)
Final dataset: 2,228 articles.
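As a quick consistency check, the counts above can be tallied in a few lines of Python. This is only a sketch: the dictionary keys are hypothetical labels, and the starting record count is not stated in this section. It is backed out here on the assumption that the four listed categories are disjoint and are the only exclusions applied at this stage.

```python
# Exclusion counts as listed above (key names are illustrative labels).
EXCLUSIONS = {
    "non_english": 29,
    "non_article_types": 1769,
    "published_2025": 63,
    "manually_excluded": 37,
}
FINAL_DATASET = 2228

total_excluded = sum(EXCLUSIONS.values())

# Assumption: the categories do not overlap and nothing else was removed
# between the starting set and the final dataset.
implied_start = FINAL_DATASET + total_excluded

print(total_excluded)  # 1898
print(implied_start)   # 4126
```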
• PRISMA 2020 methodology was followed.
• Articles were screened manually after keyword filtering.
• Duplicate and off-topic articles were removed.
• Figure 1 in the paper presents the PRISMA flow diagram.
• Biblioshiny 5.0 in RStudio 4.3.2 (for bibliometric analyses)
• Gensim (Python) for Latent Dirichlet Allocation (LDA)
• BERTopic (Python) using HDBSCAN clustering
Data Preprocessing for NLP
• Token normalization applied (e.g., “AI” → “artificial intelligence”)
• Domain-specific stopwords removed
• Grid search applied for LDA alpha/eta tuning
• BERTopic clustering tuned via minimum cluster/sample sizes
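The normalization, stopword removal, and LDA grid search steps above might look roughly like the following. This is a minimal sketch, not the authors' pipeline: only the "AI" mapping comes from the text, while the other abbreviation mappings, both stopword lists, the number of topics, the candidate alpha/eta values, and the use of `u_mass` coherence as the selection criterion are all assumptions made for illustration.

```python
import re

# Only the "ai" entry is taken from the text; the others are assumed.
NORMALIZATION_MAP = {
    "ai": "artificial intelligence",
    "ml": "machine learning",
    "dl": "deep learning",
}

# Hypothetical stopword lists for illustration.
GENERIC_STOPWORDS = {"the", "a", "an", "of", "for", "and", "in", "this", "is", "on", "to"}
DOMAIN_STOPWORDS = {"paper", "study", "results", "proposed", "method"}
STOPWORDS = GENERIC_STOPWORDS | DOMAIN_STOPWORDS


def preprocess(text):
    """Lowercase, tokenize, drop stopwords, and expand known abbreviations."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [NORMALIZATION_MAP.get(tok, tok) for tok in tokens if tok not in STOPWORDS]


def grid_search_lda(token_lists, alphas, etas, num_topics=5):
    """Try every (alpha, eta) pair and return (best_score, alpha, eta).

    Requires gensim. Models are ranked by u_mass coherence (higher is better),
    which is an assumed criterion, as is num_topics.
    """
    from gensim.corpora import Dictionary
    from gensim.models import CoherenceModel, LdaModel

    dictionary = Dictionary(token_lists)
    corpus = [dictionary.doc2bow(tokens) for tokens in token_lists]

    best = None
    for alpha in alphas:
        for eta in etas:
            model = LdaModel(
                corpus=corpus,
                id2word=dictionary,
                num_topics=num_topics,
                alpha=alpha,
                eta=eta,
                passes=5,
                random_state=0,
            )
            score = CoherenceModel(
                model=model,
                corpus=corpus,
                dictionary=dictionary,
                coherence="u_mass",
            ).get_coherence()
            if best is None or score > best[0]:
                best = (score, alpha, eta)
    return best


print(preprocess("This paper proposes an AI method for AV perception"))
# ['proposes', 'artificial intelligence', 'av', 'perception']
```

With abstracts loaded into a list of strings, a call such as `grid_search_lda([preprocess(a) for a in abstracts], alphas=[0.01, 0.1, "symmetric"], etas=[0.01, 0.1])` would return the best-scoring pair; `abstracts` and the candidate values here are placeholders.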
• Publication volume by year and country
• Most cited articles and normalized total citations (NTC)
• Bradford's Law core journals and H-index analysis
• Thematic evolution and keyword co-occurrence networks
• Topic modeling with LDA and BERTopic
• Collaboration maps for authors and countries
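Biblioshiny computes these indicators internally, but three of them are simple enough to sketch by hand. The code below is an illustrative approximation under stated assumptions, not Biblioshiny's implementation: NTC is taken as an article's citations divided by the mean citations of articles from the same year, and the Bradford core zone is taken as the top-ranked journals that together hold about one third of all articles. Function names and sample values are hypothetical.

```python
from collections import defaultdict


def h_index(citations):
    """Largest h such that h items have at least h citations each."""
    h = 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        if cites < rank:
            break
        h = rank
    return h


def normalized_citations(records):
    """records: list of (year, total_citations) pairs.

    Returns one NTC value per record: citations divided by the mean
    citations of all records published in the same year (assumed definition).
    """
    by_year = defaultdict(list)
    for year, cites in records:
        by_year[year].append(cites)
    mean_by_year = {year: sum(v) / len(v) for year, v in by_year.items()}
    return [
        cites / mean_by_year[year] if mean_by_year[year] else 0.0
        for year, cites in records
    ]


def bradford_core(journal_counts):
    """Journals in the first Bradford zone (about one third of the articles)."""
    total = sum(journal_counts.values())
    ranked = sorted(journal_counts.items(), key=lambda kv: kv[1], reverse=True)
    core, running = [], 0
    for journal, n_articles in ranked:
        if running >= total / 3:
            break
        core.append(journal)
        running += n_articles
    return core


# Made-up sample values, for illustration only.
print(h_index([10, 8, 5, 4, 3]))                                   # 4
print(normalized_citations([(2020, 10), (2020, 30), (2021, 5)]))   # [0.5, 1.5, 1.0]
print(bradford_core({"A": 50, "B": 30, "C": 10, "D": 5, "E": 5}))  # ['A']
```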