Paper 034: Using the USAS Semantic Tagset to Explore Persuasive Language in Jeremy Taylor’s Holy Living and Holy Dying, 1650-1651

anthony September 30, 2020 Uncategorized 10 Comments

THOMAS, Dax (Meiji Gakuin University, Japan)

Keywords: semantic tagging, persuasive language, Wmatrix

Abstract

This presentation reports on the initial stages of a study on persuasive language in two texts, Holy Living and Holy Dying, written by Jeremy Taylor in 1650 and 1651. The purpose of the study is two-fold: 1) to explore the persuasive language Taylor uses in his writing; and 2) to explore the usefulness of semantic tagging in this type of investigation. The corpus, consisting of the two Taylor texts (226,035 tokens), was first tagged with the USAS Semantic Tagset using the Wmatrix interface. A list of key concepts (semantic keyness) was generated using the Wmatrix interface and concordance lines were consulted for finer detail on the nature of the persuasive technique being used. Several persuasion techniques (such as “emotional appeal”, “attack”, “inclusive/exclusive language”) were selected and semantic tags were identified from the USAS tagset that related to each of these persuasion techniques. When exploring “emotional appeal”, for example, the “E” tag (Emotion) was used as a search item. While not all semantic tags resulted in useful search results, it was found that Taylor seemed to prefer negative persuasion techniques, such as appeals to fear and sadness in his writing. This is illustrated well by four out of the top five most frequent Emotion-related items being E4.1- (repentance), E5- (fear), E4.1- (sorrow), and E4.1- (sad). By working with a general-to-specific approach — that is, from persuasion technique, to general semantic concept category, to specific lexical item — elements of persuasion in the text could be readily identified.

Supplementary Information

None

Q&A live (Zoom) session

Link:	None.
Notes:	None.

10 Comments

yasutakeishii
September 30, 2020 at 8:05 am

This looks to be a very interesting paper. I hope you enjoy the conference! – Organizing committee
anthony Post author
October 2, 2020 at 3:10 am

@Dax, thank you for the presentation. I thought your comments on the advantages and value of using a semantic tagger to contribute to the analysis of discourse were interesting and convincing. I had a few questions when listening to your talk:
1) As keyness analysis is designed to show “words that are unusually frequent in the target corpus compared with the reference corpus”, why would you consider those to be *representative* themes in the target corpus? To explain, maybe positive language is frequent in both the target and reference corpus. As a result, it would *not* show up as a keyness factor.
2) This is related to 1), but why did you use the BNC as the reference corpus? Wouldn’t it be better to use a reference corpus of general English used at the time of the target writing?
3) Do you have any opinions on *why* the texts contained so many negative words? I think this is always a crucial part of a discourse study; otherwise it becomes purely descriptive.

Thanks for the presentation!
- daxmthomas
  October 3, 2020 at 4:11 am
  
  Hi Laurence,
  
  Thanks so much for your comments and insightful questions. I’ll do my best to address them as fully as I can.
  
  1) As keyness analysis is designed to show “words that are unusually frequent in the target corpus compared with the reference corpus”, why would you consider those to be *representative* themes in the target corpus? To explain, maybe positive language is frequent in both the target and reference corpus. As a result, it would *not* show up as a keyness factor.
  
  I see. So then rather than saying that Taylor uses more negative language than positive language in his *own* texts, really we can only say that he uses more negative language than is found in the *reference corpus*. I think I had in mind that the BNC would be a fairly broad and representative sample of both positive and negative language in general. Perhaps not a very good assumption to make. Thanks for pointing this out.
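
The point about keyness can be made concrete with a small sketch. It assumes the two-corpus log-likelihood statistic that is commonly used for keyness; the function name, corpus sizes, and frequencies below are invented for illustration and are not figures from this study.

```python
import math


def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Two-corpus log-likelihood keyness score for a single item.

    freq_* are the item's raw frequencies, size_* the corpus sizes in tokens.
    """
    total = freq_target + freq_ref
    combined_size = size_target + size_ref
    # Expected frequencies if the item were spread evenly across both corpora.
    expected_target = size_target * total / combined_size
    expected_ref = size_ref * total / combined_size
    score = 0.0
    for observed, expected in ((freq_target, expected_target),
                               (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


# Hypothetical counts: a category at 1% of the tokens in BOTH corpora.
same_rate = log_likelihood(2_000, 200_000, 10_000, 1_000_000)
# Hypothetical counts: 1% in the target corpus but 0.5% in the reference.
higher_in_target = log_likelihood(2_000, 200_000, 5_000, 1_000_000)
```

In this sketch `same_rate` comes out at zero even though the category is frequent in the target corpus, which is why a category that is common in both corpora never surfaces as key.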
  
  2) This is related to 1), but why did you use the BNC as the reference corpus? Wouldn’t it be better to use a reference corpus of general English used at the time of the target writing?
  
  Thanks for pointing this out, too. Honestly, I think I was focussing so much on the reference corpus just being a broad sample of general writing that I simply neglected to consider the time period. (Also, I’m used to using the BNC for so many non-historical things that I suppose I just instinctively gravitated to it. A very poor excuse to be sure!) I certainly should have given this more thought. You’ve made me now very curious to redo the keyness analysis with a time-appropriate reference corpus. Just now I’ve rerun the keyness tool on the text using the CEECS (Corpus of Early English Correspondence Sampler; from the Oxford Text Archive) as the reference corpus. It contains writings (letters) much closer to the time period. Interestingly, the results are not too different from those obtained when using the BNC. (Screenshot here, if you’re interested: https://drive.google.com/file/d/1AvDogrqBdnlaxO6LA5QZcUVJQFMpE3Lv/view)
  
  3) Do you have any opinions on *why* the texts contained so many negative words? I think this is always a crucial part of a discourse study; otherwise it becomes purely descriptive.
  
  I haven’t looked into this in great depth yet (that’s the next part of my study) but I have a couple of ideas.
  
  His negativity could have been influenced in part by:
  
  i) historical/social factors (the general chaos present at the time but more specifically the defeat of the Royalist forces, for which he fought)
  
  ii) religious factors (the state of Christian eschatology at the time)
  
  iii) personal factors (the death of his first wife around the time of writing of Holy Dying)
  
  Things I’m thinking of exploring that might shed some light on this as well:
  
  i) the relationship between religious discourse and criminal law (Is Taylor drawing on other genres?; punishment as salvation?)
  
  ii) Taylor’s apparent bleak outlook on the state of the church at the time: “I have lived to see religion painted upon banners, and thrust out of churches; and the temple turned into a tabernacle, and that tabernacle made ambulatory, and covered with skins of beasts and torn curtains…” (from Taylor’s dedication to Lord Vaughan in Holy Living)
  
  Thanks again, Laurence. Your comments and questions (as always) have given me a lot to think about, not only regarding this study (my first foray into studying historical texts with corpus tools), but to carry forth into future studies as well. Much appreciated.
iskwshin
October 3, 2020 at 2:34 am

Thank you for an interesting talk. I feel that corpus-based literature analysis is really a promising field of study. One small question. USAS is based on the Longman Lexicon of Contemporary English (1981), though it has been slightly modified by a Lancaster team. Tom McArthur’s framework is for the classification of the vocabulary of current English. Do you think it is effective even for the study of the works in the 17C? Or what “new” semantic fields should be added to discuss the works in those days? (Shin Ishikawa, Kobe U)
- daxmthomas
  October 3, 2020 at 5:28 am
  
  Hi Shin,
  
  Thanks very much for your questions and for the interesting background on the USAS tagset. I’ll do my best to give you my thoughts on your questions.
  
  1. Do you think it is effective even for the study of the works in the 17C?
  
  Generally speaking (and without getting too philosophical about it), I think basic human nature is a relatively consistent thing over time. And it seems to me that *most* of the categories used in USAS are quite universal overall. So as far as the *categories* that are there are concerned, I think the tagset is generally effective.
  
  Having said that, some problems could, of course, occur with how specific words are assigned to those semantic categories. One example from my paper was the word “great” which was included in the “evaluation” category when the results in the concordance lines showed it was often being used in its “quantity” sense. While this particular case might not be a period-related problem, it shows the potential for problems with words that have different senses over time. The word “awful” for example, might fall into the “bad (negative evaluation)” category by today’s usage, but its 17th-century usage might better be placed in a category that relates more to its meaning of “inspiring awe”.
  
  Another place where there might be problems is in dealing with words that are completely out of use today. The only way the tagger would be able to handle these would be to leave them untagged. This leads to your second question.
  
  2. Or what “new” semantic fields should be added to discuss the works in those days?
  
  I’m not sure that “new” semantic fields are strictly necessary, but I think it would be good to have an Early Modern English “plug in” that could be used when studying texts from this period. This could be optionally included together with the original USAS tagset when first tagging the file. More period-specific words relating to the categories of transportation, technology, food, religion, politics, and law could be included and would be quite welcome, I think.
  
  Thanks again for your questions. They’ve given me some very interesting things to consider for future projects.
  - iskwshin
    October 4, 2020 at 2:09 pm
    
    Thank you very much for your comment. Yes. I agree. Thanks!
yuyating
October 3, 2020 at 3:23 am

@Dax

Thank you for an interesting talk.

How do you deal with the mistagging from the USAS tool? I have tried that tool before. It generated a lot of information that was irrelevant to my research questions, and also a lot of mistagging.
- daxmthomas
  October 3, 2020 at 6:33 am
  
  Hi Yuyating,
  
  Thanks very much for your question. I’ll do my best to answer.
  
  1. How do you deal with the mistagging from the USAS tool?
  
  At the moment, the only way I can think of to deal with mistagging is to go to the concordance lines and catch them manually. However, on Wmatrix, I think this is only possible when using the keyness tool. When you are doing a regular search for semantic tags directly on Wmatrix, the output presents a link to the concordance lines for those words. However, those concordance lines are not filtered by tag and so include *all* senses of the word, not just those carrying the search tag. (Perhaps this will change in the future?)
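
If the tagged text can be exported, one possible workaround is to build the tag-filtered concordance outside the interface. The following is only a sketch: it assumes the export has already been parsed into (word, tag) pairs, which is a hypothetical format rather than Wmatrix's actual output, and the tags attached to the sample tokens are illustrative rather than real tagger output.

```python
def concordance_by_tag(tagged_tokens, tag_prefix, window=4):
    """Return (left context, node word, right context) tuples for every
    token whose semantic tag starts with tag_prefix."""
    lines = []
    for i, (word, tag) in enumerate(tagged_tokens):
        if tag.startswith(tag_prefix):
            left = " ".join(w for w, _ in tagged_tokens[max(0, i - window):i])
            right = " ".join(w for w, _ in tagged_tokens[i + 1:i + 1 + window])
            lines.append((left, word, right))
    return lines


# Invented sample: "great" appears once in a quantity sense and once in an
# evaluation sense, so a plain word search would return both occurrences.
tagged = [
    ("fear", "E5-"), ("of", "Z5"), ("a", "Z5"), ("great", "N5+"),
    ("number", "N5"), ("and", "Z5"), ("a", "Z5"), ("great", "A5.1+"),
    ("fear", "E5-"),
]

evaluation_hits = concordance_by_tag(tagged, "A5.1")
emotion_hits = concordance_by_tag(tagged, "E")
```

Here `evaluation_hits` contains only the second "great", so the quantity sense no longer clutters the lines that have to be checked by hand.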
  
  Having said that, it seems to me that mistags can happen not only because of the tagger (often using the wrong sense of a word; I used “great” as an example in this study), but also because of quirks in the corpus text (again here, the short form for “Chapter” – “CHAP” being used in the text, and then being read as a synonym for “man”). So, perhaps another way to help lower the chance of mistagging might be to take extra time to groom your corpus and make sure the text is standardized (for abbreviations and spelling) to the best of your ability.
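
The grooming step could be scripted before the file is uploaded for tagging. This is a minimal sketch under stated assumptions: the rule table and the `normalise` function are hypothetical, and the sample strings are invented rather than quoted from Taylor's text.

```python
import re

# Hypothetical rule table: regex pattern -> standardized form.
# \bCHAP\b matches the upper-case abbreviation only, so "CHAPTER" and the
# lower-case noun "chap" are left alone; \.? also consumes a trailing full stop.
ABBREVIATIONS = {
    r"\bCHAP\b\.?": "Chapter",
}


def normalise(text, rules=ABBREVIATIONS):
    """Apply each standardization rule to the text in turn."""
    for pattern, replacement in rules.items():
        text = re.sub(pattern, replacement, text)
    return text


heading = normalise("CHAP. I. Of the general instruments")
untouched = normalise("a poor chap")
```

Further rules, for example period spelling variants, could be added to the same table once they have been checked against the concordance lines.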
  
  These are hardly complete solutions, I know, but I confess to not knowing much about what goes on “under the hood” of a tagger so I’m afraid I can’t make any suggestions on how to improve accuracy from that side of the process.
  
  I don’t know if this helped much. Hopefully, you’ll have better luck with USAS in the future.
  
  Thanks again for your helpful question.
RawsthorneMat
October 3, 2020 at 6:34 am

I really appreciated your description of creating a specific tagset for persuasion and careful evaluation. Have you looked at triangulating results with argumentation mining tools from Natural Language Processing?
daxmthomas
October 3, 2020 at 11:20 am

Hi Rawsthornemat,

Thanks very much for your comments. I’ve always thought of NLP as being pretty much in the realm of machine learning and AI, so I haven’t really gotten into it. I don’t know much about argumentation mining tools, though I imagine things like topic modeling and sentiment analysis could be right in line with what I’m looking at here. Is there any particular software you’d recommend for someone new to NLP to begin investigating this?

Thanks again for your comments.
