Using the TAC Biomedical Summarization Corpus (Min-Yen), we extracted 45 manually curated scholarly assertions (selection described below). From these, we generated three independently executed Web-based questionnaires in which researchers in the biomedical domain, from different linguistic groups but comparable research institutes, were presented with assertions and asked to categorize the strength of each assertion. Depending on the questionnaire, the scale had four (High, Medium High, Medium Low and Low), three (Category 1, Category 2 and Category 3) or two (Relatively High and Relatively Low) certainty levels. We applied the G Index coefficient (Holley & Guilford, 1964) to determine the degree of agreement between annotators. We then extracted the essential features of inter-rater agreement from the questionnaire data using Principal Component Analysis (PCA) (Jolliffe, 2011), and grouped our collection of statements into clusters using the k-means algorithm (Dunham, 2006). Finally, we trained an automated classifier on the results of this study using deep-learning techniques (manuscript in preparation). We used this classifier to construct exemplar scholarly assertions annotated with certainty metadata, which we published as machine-readable NanoPublications.
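To make the agreement step concrete, the following is a minimal sketch of a G Index calculation, not the analysis code used in this study. For two categories the coefficient is G = 2Pc - 1, where Pc is the observed proportion of agreement. The extension to k categories, G = (Pc - 1/k) / (1 - 1/k), and the averaging over all annotator pairs are assumptions made here for illustration, as are the function names `g_index` and `mean_pairwise_g` and the toy ratings.

```python
from itertools import combinations


def g_index(ratings_a, ratings_b, n_categories):
    """Chance-corrected agreement between two annotators.

    Assumes chance agreement is 1 / n_categories. With two categories
    this reduces to G = 2 * Pc - 1.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    # Pc: proportion of assertions placed in the same category by both annotators.
    agreements = sum(a == b for a, b in zip(ratings_a, ratings_b))
    p_c = agreements / len(ratings_a)
    chance = 1 / n_categories
    return (p_c - chance) / (1 - chance)


def mean_pairwise_g(ratings_by_annotator, n_categories):
    """Average G over every pair of annotators (an illustrative assumption)."""
    pairs = list(combinations(ratings_by_annotator, 2))
    if not pairs:
        raise ValueError("at least two annotators are required")
    return sum(g_index(a, b, n_categories) for a, b in pairs) / len(pairs)


# Hypothetical ratings for four assertions on the two-level scale.
annotator_1 = ["High", "Low", "High", "Low"]
annotator_2 = ["High", "Low", "Low", "Low"]
annotator_3 = ["High", "Low", "High", "Low"]

print(g_index(annotator_1, annotator_2, n_categories=2))
print(mean_pairwise_g([annotator_1, annotator_2, annotator_3], n_categories=2))
```

In this toy example the first two annotators agree on three of four assertions, so Pc = 0.75 and G = 0.5. A value of 1 indicates perfect agreement and 0 indicates agreement at the assumed chance level.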