r - SVM implementation using RTextTools
This is the first time I am using RTextTools. I have to run an SVM classification on a collection of text documents. I am following this tutorial.
Here is my code, step by step.
First, I read my data and supplied an index file. The index file lists all the text documents to be classified, each with its respective label. For example, if there is a file abc.txt that belongs to style X, the index file contains the entry abc.txt, X, and so on.
    data = read_data('C:/user/del/dropbox/bundale/corpus/dublub/scalain/folder', type = c('folder'), index = 'c:/User/del/dropbox/bundelli/corpus/whub/scalain/index.txt')

Second, I create a document-term matrix.

    doc_matrix <- create_matrix(data, language = "english", removeNumbers = TRUE, stemWords = TRUE, removeSparseTerms = .8)

Third, I create a container.

    container <- create_container(doc_matrix, data$style, trainSize = 1:93, testSize = 94:116, virgin = FALSE)

Here, data$style holds the labels: the style label of each document, in exactly the same order as in the index file.
But now, when I try to train an SVM on the container using the following code,

    SVM <- train_model(container, "SVM")

it gives me this error:

    Error in svm.default(x = container@training_matrix, y = container@training_codes, : x and y don't match.

When I look at the structure of the container, it shows the training codes as empty, like this:

    Slot "training_codes":
    factor(0)
    Levels:

    Slot "testing_codes":
    factor(0)
    Levels:

If you wish, I can post the complete structure of the container, though it may not be needed. Can someone please help? I have been trying hard to find an answer. Could something be wrong with the index file passed to read_data, or is the problem with the data$style variable? I am new to this, so I may have set things up wrong. I would be most grateful.
*** Resolved ***
As suggested by @Thajja, I checked str(data). After that, the create_matrix call was changed to:

    doc_matrix <- create_matrix(data, language = "english", removeNumbers = TRUE, stemWords = TRUE, removeSparseTerms = .8)

The create_container call was also changed to:

    container <- create_container(doc_matrix, data$style, trainSize = 1:93, testSize = 94:116, virgin = FALSE)
You are on the right track in checking the container's structure to debug the problem.
Maybe use data$text or something similar in the create_matrix step, because data appears to be a list of elements (judging by how it is used in the create_container step).
Check the structure of data using str(data) and pass the correct argument to create_matrix().
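A minimal sketch of that check, written in base R so it runs without RTextTools. The data frame is a made-up stand-in for whatever read_data() returns, and the column names Text.Data and style, as well as the helper labels_match, are assumptions for illustration only; substitute the names that str(data) actually reports.

```r
# Hypothetical stand-in for the object returned by read_data();
# the column names are assumptions, not the real output.
data <- data.frame(
  Text.Data = c("first document text", "second document text", "third document text"),
  style = c("X", "Y", "X"),
  stringsAsFactors = FALSE
)

# Inspect the structure to see which element holds the text
# and which element holds the labels.
str(data)

# Hypothetical helper: TRUE only if the label element exists
# and supplies exactly one label per document.
labels_match <- function(d, text_col, label_col) {
  labels <- d[[label_col]]
  !is.null(labels) && length(labels) == length(d[[text_col]])
}

labels_match(data, "Text.Data", "style")  # label element is present
labels_match(data, "Text.Data", "Style")  # misspelled name: d[["Style"]] is NULL
```

If the label element turns out to be missing or named differently, data$style evaluates to NULL, which would be consistent with the empty factor(0) shown in training_codes. Once the right elements are known, the text element (for example data$Text.Data, if that is what str(data) shows) goes to create_matrix() and the label element goes to create_container().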