Filter Words from a Corpus by Attribute — filterWordsAttribute • tenet

The function extracts the relative position of an attribute associated with the metadata of each text in a corpus object.

Usage

filterWordsAttribute(corpus,
            docvar,
            value,
            aggr.by.var)

Arguments

corpus: A quanteda corpus object.
docvar: Metadata variable associated with the texts in a quanteda corpus object.
value: Values contained in the metadata variable (docvar), for example a character vector such as c("Nixon", "Bush").
aggr.by.var: Grouping variable used to create a new aggregated corpus.

Details

The function searches for words based on values from a metadata variable and allows the data to be re-aggregated according to another grouping variable. It helps identify actors, institutions, and attributes present in the text and highlights their relative position from a more aggregated perspective.
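
The idea described above can be illustrated with a minimal base R sketch. It is not the implementation used by filterWordsAttribute: the toy data frame `docs`, the helper `relative_positions()`, and the definition of relative position as token index divided by the number of tokens in the text are all assumptions made for illustration.

```r
# Toy stand-in for a corpus: texts plus two metadata columns (hypothetical data)
docs <- data.frame(
  text      = c("Nixon spoke first and Kennedy replied",
                "The senate heard Bush today",
                "Kennedy and Roosevelt shaped the party"),
  President = c("Nixon", "Bush", "Kennedy"),
  Party     = c("Republican", "Republican", "Democratic"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: find each keyword in each text and report its
# position as a share of the text length (assumed definition), together
# with the value of the grouping variable for that text.
relative_positions <- function(docs, value, group_var) {
  rows <- lapply(seq_len(nrow(docs)), function(i) {
    tokens <- strsplit(docs$text[i], "\\s+")[[1]]  # naive whitespace tokenizer
    hits   <- which(tokens %in% value)             # token indices of keywords
    data.frame(
      word     = tokens[hits],
      position = hits / length(tokens),
      group    = rep(docs[[group_var]][i], length(hits)),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

res <- relative_positions(
  docs,
  value     = c("Nixon", "Bush", "Kennedy", "Roosevelt"),
  group_var = "Party"
)
print(res)
```

The result has one row per keyword occurrence, so rows sharing the same `group` value can be compared or summarised together, which is the kind of aggregated view the grouping variable is meant to provide.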

Value

A data.frame containing the words retrieved, their relative position in each text, and the grouping variable of the new corpus.

Examples

if (FALSE) {
library(tenet)

# Retrieve a corpus of texts (US presidential inaugural addresses)
tx <- quanteda::data_corpus_inaugural

# Find the relative position of the keywords, grouped by party
filterWordsAttribute(corpus = tx,
                     docvar = "President",
                     value = c("Nixon", "Bush",
                               "Kennedy", "Roosevelt"),
                     aggr.by.var = "Party")
}