r - SVM implementation using RTextTools


This is the first time I am using RTextTools. I have to implement an SVM classification on a collection of text documents. I am following this tutorial.

Here is my code, step by step.

First, I read my data and supplied an index file. The index file lists all the text documents to be classified, together with their respective labels. For example, if a file abc.txt belongs to style X, the index file stores it as abc.txt,X and so on.

  data = read_data('C:/user/del/dropbox/bundale/corpus/dublub/scalain/folder', type = c('folder'), index = 'c:/User/del/dropbox/bundelli/corpus/whub/scalain/index.txt')

Second, I create a document-term matrix.

  doc_matrix <- create_matrix(data, language = "english", removeNumbers = TRUE, stemWords = TRUE, removeSparseTerms = .8)

Third, I create a container.

  container <- create_container(doc_matrix, data$style, trainSize = 1:93, testSize = 94:116, virgin = FALSE)

Here, data$style holds the labels: the style label of each document, in exactly the same order as in the index file.

But now, when I try to train an SVM on the container using the following code:

  SVM <- train_model(container, "SVM")

This gives me the following error:

  Error in svm.default(x = container@training_matrix, y = container@training_codes, : x and y don't match.

When I look at the structure of the container, it shows the training codes as empty, like this:

  Slot "training_codes":
  factor(0)
  Levels:

  Slot "testing_codes":
  factor(0)
  Levels:

If you wish, I can post the complete structure of the container, but I am not sure it would help. Can someone please help? I have been trying hard to find an answer. Could something be wrong with the index file passed to read_data, or is there a problem with the data$style variable? I am new to this, so I may have set things up wrong. I would be most grateful.

*** Resolved ****

As suggested by @Thajja, I checked str(data). After that, the following was changed:

  doc_matrix <- create_matrix(data, language = "english", removeNumbers = TRUE, stemWords = TRUE, removeSparseTerms = .8)

This was also changed:

  container <- create_container(doc_matrix, data$style, trainSize = 1:93, testSize = 94:116, virgin = FALSE)

You are on the right track in debugging the problem by checking the container's structure.
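As a rough illustration of what that check reveals, the sketch below uses plain base R stand-ins, not a real RTextTools container. The objects `training_matrix` and `training_codes` are hypothetical and only mimic the two slots named in the error message. The message suggests the number of rows in the matrix and the number of labels disagree, which is what you get when the labels are empty:

```r
# Stand-ins for the two container slots named in the error message.
# These are plain base R objects, not a real RTextTools container.
training_matrix <- matrix(0, nrow = 93, ncol = 10)  # 93 training documents
training_codes  <- factor(character(0))             # empty labels, as in the question

# Compare the number of documents with the number of labels.
n_docs   <- nrow(training_matrix)
n_labels <- length(training_codes)

if (n_docs != n_labels) {
  message("Mismatch: ", n_docs, " rows but ", n_labels, " labels")
}
```

If the two counts differ on the real container, the labels passed to create_container() are the first thing to inspect.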

Maybe use data$text or something similar in the matrix-creation step, because it looks like data is a list of elements (that is how you use it in the container step).

Check the structure of data using str(data) and pass the correct element to create_matrix().
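A minimal sketch of that check, using a made-up data frame in place of the real result of read_data(). The column names `doc_text` and `doc_label` are hypothetical and chosen only for illustration; use whatever names str(data) actually reports. It also shows one way an empty factor like the one in the container can arise: asking a data frame for a column that does not exist returns NULL rather than raising an error.

```r
# Made-up stand-in for the object returned by read_data();
# the column names below are hypothetical.
data <- data.frame(
  doc_text  = c("first document", "second document", "third document"),
  doc_label = c("X", "Y", "X"),
  stringsAsFactors = FALSE
)

# Inspect the real column names and types.
str(data)

# Asking for a column that does not exist returns NULL without an error ...
missing_labels <- data$style
is.null(missing_labels)

# ... and turning that into a factor gives an empty factor, factor(0).
empty_codes <- as.factor(missing_labels[1:2])
length(empty_codes)

# Using a name that str() reported gives one label per document.
labels <- as.factor(data$doc_label)
length(labels) == nrow(data)
```

The same idea applies to the text column: pass the column that actually holds the document text to create_matrix(), and the column that holds the labels to create_container().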
