STEP 1: Text preprocessing

POS tagging

When using the tm package, you don’t need to perform part-of-speech (POS) tagging, since that package has no functionality that requires it. However, other packages do offer POS tagging, for example qdap and udpipe. For didactic reasons we will apply it to the non-spellchecked reviews.

qdap package

The qdap package uses the Penn Treebank tag set. We will use the pos function:

p_load(qdap)

posdat <- pos(reviews)
posdat
  wrd.cnt       CC      CD        DT        IN        JJ     JJR     JJS      MD        NN      NNS      PRP    PRP$       RB    RBR    RBS       TO       VB     VBD     VBG     VBN     VBP      VBZ     WRB
1      71  3(4.2%) 2(2.8%) 10(14.1%)   5(7.0%)   6(8.5%) 1(1.4%) 1(1.4%) 1(1.4%)  9(12.7%)  4(5.6%)  7(9.9%)       0  5(7.0%)      0      0  2(2.8%)  3(4.2%) 1(1.4%) 1(1.4%) 2(2.8%) 3(4.2%)  5(7.0%)       0
2      23  2(8.7%)       0   1(4.3%)   2(8.7%)   1(4.3%) 1(4.3%)       0       0  4(17.4%)        0  1(4.3%) 1(4.3%) 4(17.4%)      0      0  1(4.3%)  1(4.3%) 1(4.3%) 1(4.3%)       0       0  2(8.7%)       0
3      98  5(5.1%)       0 12(12.2%) 13(13.3%) 12(12.2%) 1(1.0%)       0       0 23(23.5%)        0  5(5.1%) 3(3.1%)  8(8.2%)      0      0  3(3.1%)  2(2.0%) 2(2.0%) 2(2.0%) 1(1.0%) 1(1.0%)  4(4.1%) 1(1.0%)
4     336 19(5.7%)       0 36(10.7%) 50(14.9%)  30(8.9%)       0       0  3(.9%) 58(17.3%) 11(3.3%) 17(5.1%) 8(2.4%) 32(9.5%) 1(.3%) 1(.3%) 12(3.6%) 14(4.2%) 5(1.5%) 9(2.7%) 5(1.5%) 9(2.7%) 13(3.9%)  3(.9%)
5      13  1(7.7%)       0   1(7.7%)   1(7.7%)  3(23.1%)       0       0       0  3(23.1%)        0        0 1(7.7%)        0      0      0  1(7.7%)  1(7.7%)       0       0       0       0  1(7.7%)       0

The output above shows the word count of each document (wrd.cnt), together with the count and percentage of each POS tag. To see what each tag stands for, use the following function.

pos_tags()
   Tag  Description                             
1  CC   Coordinating conjunction                          19 PRP$ Possessive pronoun  
2  CD   Cardinal number                                   20 RB   Adverb 
3  DT   Determiner                                        21 RBR  Adverb, comparative 
4  EX   Existential there                                 22 RBS  Adverb, superlative  
5  FW   Foreign word                                      23 RP   Particle 
6  IN   Preposition or subordinating conjunction          24 SYM  Symbol   
7  JJ   Adjective                                         25 TO   to 
8  JJR  Adjective, comparative                            26 UH   Interjection
9  JJS  Adjective, superlative                            27 VB   Verb, base form  
10 LS   List item marker                                  28 VBD  Verb, past tense 
11 MD   Modal                                             29 VBG  Verb, gerund or present participle  
12 NN   Noun, singular or mass                            30 VBN  Verb, past participle   
13 NNS  Noun, plural                                      31 VBP  Verb, non-3rd person singular present  
14 NNP  Proper noun, singular                             32 VBZ  Verb, 3rd person singular present 
15 NNPS Proper noun, plural                               33 WDT  Wh-determiner 
16 PDT  Predeterminer                                     34 WP   Wh-pronoun 
17 POS  Possessive ending                                 35 WP$  Possessive wh-pronoun
18 PRP  Personal pronoun                                  36 WRB  Wh-adverb
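
To make these counts concrete, the following base R sketch recomputes the tag tally for the fifth review by hand. It assumes the tagged string has been copied from qdap's tagged output for that review; the variable names (tagged, pairs, tags, tag_counts, tag_share) are illustrative and not part of qdap.

```r
# Tagged text for review 5, copied by hand from the qdap output.
tagged <- "great/JJ case/NN easy/JJ to/TO use/VB thin/JJ and/CC turns/VBZ my/PRP$ ipad/NN into/IN a/DT macbook/NN"

# Split on spaces, then keep only what follows the final "/" of each token.
pairs <- strsplit(tagged, " ", fixed = TRUE)[[1]]
tags  <- sub("^.*/", "", pairs)

# Tally the tags and express each tally as a percentage of the word count.
tag_counts <- table(tags)
tag_share  <- round(100 * tag_counts / length(tags), 1)

length(tags)   # word count of the review
tag_counts     # number of tokens per tag
tag_share      # percentage of tokens per tag
```

The word count and the per-tag figures obtained this way should agree with row 5 of the pos output.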

Let’s look at some popular methods.

preprocessed(posdat)
  POStagged
1 two/CD monthlong/JJ trips/NNS abroad/RB this/DT is/VBZ the/DT best/JJS it/PRP take/VB a/DT little/JJ while/NN to/TO get/VB used/VBN to/TO the/DT smaller/JJR keyboard/NN but/CC once/RB you/PRP do/VBP it/PRP works/VBZ flawlessly/RB the/DT charge/NN lasts/VBZ a/DT very/RB long/JJ time/NN months/NNS they/PRP say/VBP no/DT problem/NN not/RB recharging/VBG it/PRP for/IN weeks/NNS of/IN constant/JJ use/NN solid/JJ looks/VBZ good/JJ and/CC protects/VBZ the/DT ipad/NN i/NN couldnt/MD survive/VB without/IN it/PRP i/IN havent/PRP tried/VBD any/DT others/NNS but/CC i/NN am/VBP sold/VBN on/IN this/DT one/CD
2 this/DT is/VBZ nearly/RB as/RB heavy/JJ as/IN my/PRP$ laptop/NN and/CC i/NN was/VBD hoping/VBG to/TO find/VB something/NN lighter/JJR for/IN travel/NN but/CC it/PRP works/VBZ well/RB anyway/RB
3 wonderfully/RB thin/JJ light/NN and/CC durable/JJ the/DT keyboard/NN works/VBZ extremely/RB well/RB for/IN me/PRP my/PRP$ only/RB wish/VB about/IN this/DT is/VBZ that/IN the/DT angle/NN was/VBD not/RB quite/RB so/RB steep/JJ when/WRB open/JJ or/CC perhaps/RB adjustable/JJ if/IN i/PRP hold/VBP it/PRP on/IN my/PRP$ lap/NN with/IN the/DT front/JJ edge/NN of/IN the/DT keyboard/NN at/IN my/PRP$ navel/NN that/IN tilts/VBZ it/PRP to/TO an/DT acceptable/JJ angle/NN if/IN using/VBG it/PRP on/IN a/DT keyboard/NN tray/NN or/CC table/NN i/NN put/VBD a/DT postit/NN pad/NN or/CC similar/JJ item/NN under/IN the/DT front/JJ edge/NN to/TO put/VB the/DT view/NN angle/NN to/TO something/NN more/JJR sensible/NN but/CC overall/JJ this/DT is/VBZ much/JJ nicer/NN vs/IN the/DT keyboard/NN cover/NN id/JJ been/VBN using/VBG
4 this/DT keyboardcase/NN cover/NN is/VBZ absolutely/RB fabulous/JJ it/PRP works/VBZ so/RB well/RB and/CC it/PRP so/RB convenient/JJ and/CC stylish/JJ im/NN the/DT envy/NN of/IN all/DT of/IN my/PRP$ friends/NNS some/DT have/VBP even/RB mistaken/VBN the/DT combo/NN of/IN my/PRP$ ipad/NN and/CC this/DT keyboard/NN for/IN a/DT netbook/NN and/CC are/VBP amazed/VBN when/WRB im/PRP able/JJ to/TO so/RB easily/RB and/CC quickly/RB change/VB postions/NNS from/IN horizontal/JJ to/TO vertical/JJ and/CC then/RB snap/VB on/IN the/DT magnets/NNS to/TO coverclose/VB my/PRP$ ipad/NN and/CC on/IN the/DT go/NN in/IN seconds/NNS i/VBP also/RB love/VB how/WRB the/DT outer/JJ aluminum/NN casing/NN matching/VBG that/DT of/IN the/DT ipad/NN and/CC even/RB if/IN you/PRP add/VBP a/DT skin/NN like/IN i/NN did/VBD it/PRP still/RB looks/VBZ very/RB professional/JJ and/CC classysleek/JJ with/IN apples/NNS original/JJ design/NN im/IN a/DT college/NN student/NN and/CC this/DT very/RB versatile/JJ product/NN has/VBZ been/VBN really/RB amazingly/RB helpful/JJ easy/JJ to/TO type/NN on/IN and/CC efficient/JJ with/IN taking/VBG notes/NNS in/IN class/NN and/CC in/IN helping/VBG me/PRP to/TO use/VB my/PRP$ ipad/NN as/IN a/DT netbook/NN at/IN times/NNS and/CC also/RB to/TO detached/JJ and/CC use/NN is/VBZ solo/RB as/IN it/PRP was/VBD originally/RB intended/VBN very/RB versatile/JJ amazing/JJ product/NN i/NNS have/VBP highly/RB recommended/VBN this/DT to/TO others/NNS though/IN it/PRP does/VBZ most/RBS of/IN the/DT selling/VBG itself/PRP ps/VBZ my/PRP$ only/JJ con/NN is/VBZ that/IN the/DT ipad/NN isnt/NN as/IN secure/JJ in/IN this/DT keyboarddevice/NN in/IN the/DT vertical/JJ position/NN as/IN it/PRP is/VBZ horiztonally/RB it/PRP seems/VBZ that/IN it/PRP only/RB clicks/VBZ into/IN postion/NN with/IN a/DT locking/VBG secure/JJ feel/NN in/IN the/DT horizontal/JJ position/NN if/IN they/PRP were/VBD to/TO maybe/RB add/VB some/DT center/NN magnets/NNS to/TO the/DT keyboard/NN docking/NN areaslot/NN i/IN think/VBP it/PRP would/MD lock/VB into/IN place/NN and/CC function/NN more/RBR securely/RB since/IN being/VBG vertical/JJ makes/VBZ the/DT ipad/NN too/RB top/JJ heavy/NN for/IN the/DT keyboard/NN additionally/RB my/PRP$ keyboard/NN did/VBD actually/RB fall/VB outback/RB and/CC down/RB from/IN the/DT keyboard/NN docking/NN area/NN when/WRB in/IN vertical/JJ position/NN before/IN but/CC i/IN think/VBP it/PRP had/VBD to/TO do/VB with/IN the/DT pressing/VBG of/IN my/PRP$ fingers/NNS on/IN the/DT ipad/NN while/IN in/IN vertical/JJ position/NN not/RB a/DT problem/NN in/IN horizontal/JJ position/NN if/IN you/PRP are/VBP going/VBG to/TO touch/VB your/PRP$ screen/NN a/DT lot/NN and/CC type/NN i/NN would/MD suggest/VB the/DT horizontal/JJ position/NN but/CC if/IN you/PRP just/RB plan/VBP to/TO type/VB without/IN the/DT touching/VBG on/IN the/DT screen/NN i/VBZ think/VB the/DT vertical/JJ position/NN should/MD work/VB fine/JJ
5 great/JJ case/NN easy/JJ to/TO use/VB thin/JJ and/CC turns/VBZ my/PRP$ ipad/NN into/IN a/DT macbook/NN
  POStags   word.count
1 CD, JJ, NNS, RB, DT, VBZ, DT, JJS, PRP, VB, DT, JJ, NN, TO, VB, VBN, TO, DT, JJR, NN, CC, RB, PRP, VBP, PRP, VBZ, RB, DT, NN, VBZ, DT, RB, JJ, NN, NNS, PRP, VBP, DT, NN, RB, VBG, PRP, IN, NNS, IN, JJ, NN, JJ, VBZ, JJ, CC, VBZ, DT, NN, NN, MD, VB, IN, PRP, IN, PRP, VBD, DT, NNS, CC, NN, VBP, VBN, IN, DT, CD   71
2 DT, VBZ, RB, RB, JJ, IN, PRP$, NN, CC, NN, VBD, VBG, TO, VB, NN, JJR, IN, NN, CC, PRP, VBZ, RB, RB   23
3 RB, JJ, NN, CC, JJ, DT, NN, VBZ, RB, RB, IN, PRP, PRP$, RB, VB, IN, DT, VBZ, IN, DT, NN, VBD, RB, RB, RB, JJ, WRB, JJ, CC, RB, JJ, IN, PRP, VBP, PRP, IN, PRP$, NN, IN, DT, JJ, NN, IN, DT, NN, IN, PRP$, NN, IN, VBZ, PRP, TO, DT, JJ, NN, IN, VBG, PRP, IN, DT, NN, NN, CC, NN, NN, VBD, DT, NN, NN, CC, JJ, NN, IN, DT, JJ, NN, TO, VB, DT, NN, NN, TO, NN, JJR, NN, CC, JJ, DT, VBZ, JJ, NN, IN, DT, NN, NN, JJ, VBN, VBG   98
4 DT, NN, NN, VBZ, RB, JJ, PRP, VBZ, RB, RB, CC, PRP, RB, JJ, CC, JJ, NN, DT, NN, IN, DT, IN, PRP$, NNS, DT, VBP, RB, VBN, DT, NN, IN, PRP$, NN, CC, DT, NN, IN, DT, NN, CC, VBP, VBN, WRB, PRP, JJ, TO, RB, RB, CC, RB, VB, NNS, IN, JJ, TO, JJ, CC, RB, VB, IN, DT, NNS, TO, VB, PRP$, NN, CC, IN, DT, NN, IN, NNS, VBP, RB, VB, WRB, DT, JJ, NN, NN, VBG, DT, IN, DT, NN, CC, RB, IN, PRP, VBP, DT, NN, IN, NN, VBD, PRP, RB, VBZ, RB, JJ, CC, JJ, IN, NNS, JJ, NN, IN, DT, NN, NN, CC, DT, RB, JJ, NN, VBZ, VBN, RB, RB, JJ, JJ, TO, NN, IN, CC, JJ, IN, VBG, NNS, IN, NN, CC, IN, VBG, PRP, TO, VB, PRP$, NN, IN, DT, NN, IN, NNS, CC, RB, TO, JJ, CC, NN, VBZ, RB, IN, PRP, VBD, RB, VBN, RB, JJ, JJ, NN, NNS, VBP, RB, VBN, DT, TO, NNS, IN, PRP, VBZ, RBS, IN, DT, VBG, PRP, VBZ, PRP$, JJ, NN, VBZ, IN, DT, NN, NN, IN, JJ, IN, DT, NN, IN, DT, JJ, NN, IN, PRP, VBZ, RB, PRP, VBZ, IN, PRP, RB, VBZ, IN, NN, IN, DT, VBG, JJ, NN, IN, DT, JJ, NN, IN, PRP, VBD, TO, RB, VB, DT, NN, NNS, TO, DT, NN, NN, NN, IN, VBP, PRP, MD, VB, IN, NN, CC, NN, RBR, RB, IN, VBG, JJ, VBZ, DT, NN, RB, JJ, NN, IN, DT, NN, RB, PRP$, NN, VBD, RB, VB, RB, CC, RB, IN, DT, NN, NN, NN, WRB, IN, JJ, NN, IN, CC, IN, VBP, PRP, VBD, TO, VB, IN, DT, VBG, IN, PRP$, NNS, IN, DT, NN, IN, IN, JJ, NN, RB, DT, NN, IN, JJ, NN, IN, PRP, VBP, VBG, TO, VB, PRP$, NN, DT, NN, CC, NN, NN, MD, VB, DT, JJ, NN, CC, IN, PRP, RB, VBP, TO, VB, IN, DT, VBG, IN, DT, NN, VBZ, VB, DT, JJ, NN, MD, VB, JJ   336
5 JJ, NN, JJ, TO, VB, JJ, CC, VBZ, PRP$, NN, IN, DT, NN   13

counts(posdat)
  wrd.cnt CC CD DT IN JJ JJR JJS MD NN NNS PRP PRP$ RB RBR RBS TO VB VBD VBG VBN VBP VBZ WRB
1      71  3  2 10  5  6   1   1  1  9   4   7    0  5   0   0  2  3   1   1   2   3   5   0
2      23  2  0  1  2  1   1   0  0  4   0   1    1  4   0   0  1  1   1   1   0   0   2   0
3      98  5  0 12 13 12   1   0  0 23   0   5    3  8   0   0  3  2   2   2   1   1   4   1
4     336 19  0 36 50 30   0   0  3 58  11  17    8 32   1   1 12 14   5   9   5   9  13   3
5      13  1  0  1  1  3   0   0  0  3   0   0    1  0   0   0  1  1   0   0   0   0   1   0

proportions(posdat)
  wrd.cnt         CC         CD         DT         IN         JJ        JJR        JJS          MD        NN        NNS        PRP       PRP$         RB        RBR        RBS         TO         VB        VBD        VBG        VBN        VBP        VBZ         WRB
1      71 0.04225352 0.02816901 0.14084507 0.07042254 0.08450704 0.01408451 0.01408451 0.014084507 0.1267606 0.05633803 0.09859155 0.00000000 0.07042254 0.00000000 0.00000000 0.02816901 0.04225352 0.01408451 0.01408451 0.02816901 0.04225352 0.07042254 0.000000000
2      23 0.08695652 0.00000000 0.04347826 0.08695652 0.04347826 0.04347826 0.00000000 0.000000000 0.1739130 0.00000000 0.04347826 0.04347826 0.17391304 0.00000000 0.00000000 0.04347826 0.04347826 0.04347826 0.04347826 0.00000000 0.00000000 0.08695652 0.000000000
3      98 0.05102041 0.00000000 0.12244898 0.13265306 0.12244898 0.01020408 0.00000000 0.000000000 0.2346939 0.00000000 0.05102041 0.03061224 0.08163265 0.00000000 0.00000000 0.03061224 0.02040816 0.02040816 0.02040816 0.01020408 0.01020408 0.04081633 0.010204082
4     336 0.05654762 0.00000000 0.10714286 0.14880952 0.08928571 0.00000000 0.00000000 0.008928571 0.1726190 0.03273810 0.05059524 0.02380952 0.09523810 0.00297619 0.00297619 0.03571429 0.04166667 0.01488095 0.02678571 0.01488095 0.02678571 0.03869048 0.008928571
5      13 0.07692308 0.00000000 0.07692308 0.07692308 0.23076923 0.00000000 0.00000000 0.000000000 0.2307692 0.00000000 0.00000000 0.07692308 0.00000000 0.00000000 0.00000000 0.07692308 0.07692308 0.00000000 0.00000000 0.00000000 0.00000000 0.07692308 0.000000000

Combined, the latter two outputs (the raw counts and the proportions) give the same information as printing the result of the pos function. Beyond these tables, it is also possible to make plots.

plot(preprocessed(posdat))

posdat
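
As a rough check on how the count and proportion tables relate to the combined pos printout, the base R sketch below pastes counts and percentages together in the same style. The values are copied by hand from the fifth review for a handful of tags; the names n_words, counts, props, and combined are illustrative and not part of qdap, and the sketch does not try to reproduce qdap's exact formatting of percentages below 1%.

```r
# Word count and a few tag counts for review 5, copied by hand from the tables.
n_words <- 13L
counts  <- c(CC = 1L, CD = 0L, JJ = 3L, NN = 3L)

# Proportion of each tag relative to the document's word count.
props <- counts / n_words

# Paste count and percentage together; zero counts are shown as a bare "0".
combined <- ifelse(counts == 0L, "0",
                   sprintf("%d(%.1f%%)", counts, 100 * props))
names(combined) <- names(counts)

combined
```

The resulting strings can be compared with the CC, CD, JJ, and NN cells of row 5 in the pos printout.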

udpipe package

The udpipe package provides both Universal Dependencies POS tags (the upos column) and treebank-specific POS tags (the xpos column). Its advantage is that it works with multiple languages (including Dutch). This means that you must first download the model for the language you need.

p_load(udpipe)

udmodel <- udpipe_download_model(language = "english")

Next, you can run the udpipe function to tokenize, POS-tag, lemmatize, and dependency-parse the data.

parsed <- udpipe(reviews, object = udmodel)
head(parsed)
  doc_id paragraph_id sentence_id
1   doc1            1           1
2   doc1            1           1
3   doc1            1           1
4   doc1            1           1
5   doc1            1           1
6   doc1            1           1
                                                                                                                                                                                                                                                                                                                                                                     sentence
1 two monthlong trips abroad this is the best it take a little while to get used to the smaller keyboard but once you do it works flawlessly the charge lasts a very long time months they say no problem not recharging it for weeks of constant use solid looks good and protects the ipad i couldnt survive without it i havent tried any others but i am sold on this one
2 two monthlong trips abroad this is the best it take a little while to get used to the smaller keyboard but once you do it works flawlessly the charge lasts a very long time months they say no problem not recharging it for weeks of constant use solid looks good and protects the ipad i couldnt survive without it i havent tried any others but i am sold on this one
3 two monthlong trips abroad this is the best it take a little while to get used to the smaller keyboard but once you do it works flawlessly the charge lasts a very long time months they say no problem not recharging it for weeks of constant use solid looks good and protects the ipad i couldnt survive without it i havent tried any others but i am sold on this one
4 two monthlong trips abroad this is the best it take a little while to get used to the smaller keyboard but once you do it works flawlessly the charge lasts a very long time months they say no problem not recharging it for weeks of constant use solid looks good and protects the ipad i couldnt survive without it i havent tried any others but i am sold on this one
5 two monthlong trips abroad this is the best it take a little while to get used to the smaller keyboard but once you do it works flawlessly the charge lasts a very long time months they say no problem not recharging it for weeks of constant use solid looks good and protects the ipad i couldnt survive without it i havent tried any others but i am sold on this one
6 two monthlong trips abroad this is the best it take a little while to get used to the smaller keyboard but once you do it works flawlessly the charge lasts a very long time months they say no problem not recharging it for weeks of constant use solid looks good and protects the ipad i couldnt survive without it i havent tried any others but i am sold on this one
  start end term_id token_id     token     lemma upos xpos                                                 feats head_token_id dep_rel  deps  misc
1     1   3       1        1       two       two  NUM   CD                                          NumType=Card             3  nummod  <NA>  <NA>
2     5  13       2        2 monthlong monthlong  ADJ   JJ                                            Degree=Pos             3    amod  <NA>  <NA>
3    15  19       3        3     trips      trip NOUN  NNS                                           Number=Plur             8   nsubj  <NA>  <NA>
4    21  26       4        4    abroad    abroad  ADV   RB                                                  <NA>             5  advmod  <NA>  <NA>
5    28  31       5        5      this      this PRON   DT                              Number=Sing|PronType=Dem             8   nsubj  <NA>  <NA>
6    33  34       6        6        is        be  AUX  VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin             8     cop  <NA>  <NA>
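
Because the result is an ordinary data frame with one row per token, the usual data frame tools apply. As a minimal sketch, the block below rebuilds a few columns of the six rows printed above by hand (so it runs without udpipe or the reviews data), counts the universal POS tags in the upos column, and keeps the lemmas of nouns and adjectives. The name toy is made up for this example; on the real data you would apply the same calls to parsed.

```r
# Hand-built copy of a few columns of the six rows shown above.
toy <- data.frame(
  doc_id = "doc1",
  token  = c("two", "monthlong", "trips", "abroad", "this", "is"),
  lemma  = c("two", "monthlong", "trip", "abroad", "this", "be"),
  upos   = c("NUM", "ADJ", "NOUN", "ADV", "PRON", "AUX"),
  xpos   = c("CD", "JJ", "NNS", "RB", "DT", "VBZ"),
  stringsAsFactors = FALSE
)

# How often does each universal POS tag occur?
upos_freq <- table(toy$upos)
print(upos_freq)

# Keep only the lemmas of nouns and adjectives, for example as input
# for a document-term matrix.
content_lemmas <- toy$lemma[toy$upos %in% c("NOUN", "ADJ")]
print(content_lemmas)
```

Filtering on upos like this is one common reason to POS tag in the first place: it lets you restrict later steps to content words.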

Let’s have a look at the data for document 2. The POS tags are stored in two columns: upos (Universal Dependencies tags) and xpos (treebank-specific tags, here the Penn Treebank tags).

head(parsed %>% filter(doc_id == 'doc2'))
  doc_id paragraph_id sentence_id                                                                                                            sentence
1   doc2            1           1 this is nearly as heavy as my laptop and i was hoping to find something lighter for travel but it works well anyway
2   doc2            1           1 this is nearly as heavy as my laptop and i was hoping to find something lighter for travel but it works well anyway
3   doc2            1           1 this is nearly as heavy as my laptop and i was hoping to find something lighter for travel but it works well anyway
4   doc2            1           1 this is nearly as heavy as my laptop and i was hoping to find something lighter for travel but it works well anyway
5   doc2            1           1 this is nearly as heavy as my laptop and i was hoping to find something lighter for travel but it works well anyway
6   doc2            1           1 this is nearly as heavy as my laptop and i was hoping to find something lighter for travel but it works well anyway 
  start end   term_id token_id  token  lemma upos xpos                                                 feats head_token_id dep_rel deps misc
1     1   4         1        1   this   this PRON   DT                              Number=Sing|PronType=Dem             5   nsubj <NA> <NA>
2     6   7         2        2     is     be  AUX  VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin             5     cop <NA> <NA>
3     9  14         3        3 nearly nearly  ADV   RB                                                  <NA>             5  advmod <NA> <NA>
4    16  17         4        4     as     as  ADV   RB                                                  <NA>             5  advmod <NA> <NA>
5    19  23         5        5  heavy  heavy  ADJ   JJ                                            Degree=Pos             0    root <NA> <NA>
6    25  26         6        6     as     as  ADP   IN                                                  <NA>             8    case <NA> <NA>
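
The two tag sets differ in granularity, and the tag of a token depends on its context. A small sketch, again using a hand-typed copy of the six rows above rather than the real parsed object (doc2_head is a made-up name), cross-tabulates upos against xpos and shows that the two occurrences of "as" receive different tags.

```r
# Hand-typed copy of three columns of the six doc2 rows shown above.
doc2_head <- data.frame(
  token = c("this", "is", "nearly", "as", "heavy", "as"),
  upos  = c("PRON", "AUX", "ADV", "ADV", "ADJ", "ADP"),
  xpos  = c("DT", "VBZ", "RB", "RB", "JJ", "IN"),
  stringsAsFactors = FALSE
)

# Cross-tabulate the universal tags (rows) against the treebank tags (columns).
tag_table <- table(upos = doc2_head$upos, xpos = doc2_head$xpos)
print(tag_table)

# The same token can get different tags depending on its role in the sentence.
as_tags <- doc2_head[doc2_head$token == "as", c("token", "upos", "xpos")]
print(as_tags)
```

Here the first "as" is tagged as an adverb (ADV/RB) and the second as an adposition (ADP/IN).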
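
The dependency relations can also be plotted. The textplot_dependencyparser function comes from the textplot package, which relies on ggraph for drawing.

p_load(textplot, ggraph)

textplot_dependencyparser(parsed %>% filter(doc_id == 'doc1'))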

[Figure: dependency parse plot, labelled doc2]

Multiple choice

Which of the following statements is correct according to the output that is shown above?

  1. The variable reviews contains 5 documents of which the fourth document contains the highest number of words.
  2. The word "monthlong" runs from character position 5 to character position 13 in the second document.
  3. According to the plot, the class of the word "two" is common across all documents.

To download the productreviews dataset, click here [1].