nlp - Create word forms using Python
How do I get different word forms using Python? I want to make a list like the following.
work = ['work', 'working', 'works']
My code:
    # Python 2 with NLTK 2 (nltk.clean_html is no longer implemented in NLTK 3).
    # html, words and n are defined elsewhere in my program.
    import itertools
    import re
    import nltk
    from nltk.stem import PorterStemmer
    from textblob import TextBlob

    raw = nltk.clean_html(html)
    cleaned = re.sub(r'& ?(ld|rd)quo ?[;\]]', '\"', raw)
    tokens = nltk.wordpunct_tokenize(cleaned)
    stemmer = PorterStemmer()
    t = [stemmer.stem(t) if t in words else t for t in tokens]
    text = nltk.Text(t)
    word = words(n)
    word = [stemmer.stem(e) for e in word]
    search_words = set(word)  # assumed reading: the set of stemmed target words
    sents = ' '.join([s.lower() for s in text])
    blob = TextBlob(sents.decode('ascii', 'ignore'))
    matches = [map(str, blob.sentences[i-1:i+2])      # previous, current and next sentence
               for i, s in enumerate(blob.sentences)  # i is the index, s is the element
               if search_words & set(s.words)]
    # return list(itertools.chain(''.join(str(y).replace('&rdquo', '').replace('&rsquo', '') for y in matches)))
    return list(itertools.chain(*matches))
This was a bit tricky. I stemmed the words in the text and then matched the stemmed forms against the list of words. I also converted the tokens to lower case first, because the stemmer was not doing that for me; once both sides were lower-cased and stemmed, the comparison matched completely. Below is the updated code:
    # Python 2 with NLTK 2 (nltk.clean_html is no longer implemented in NLTK 3).
    # html, words and n are defined elsewhere in the program.
    import itertools
    import re
    import nltk
    from nltk.stem import PorterStemmer
    from textblob import TextBlob

    raw = nltk.clean_html(html)
    cleaned = re.sub(r'& ?(ld|rd)quo ?[;\]]', '\"', raw)
    tokens = nltk.wordpunct_tokenize(cleaned)
    lower = [w.lower() for w in tokens]  # lower-case before stemming
    stemmer = PorterStemmer()
    t = [stemmer.stem(t) if t in words else t for t in lower]
    text = nltk.Text(t)
    word = words(n)
    word = [stemmer.stem(e) for e in word]
    search_words = set(word)  # assumed reading: the set of stemmed target words
    sents = ' '.join([s.lower() for s in text])
    blob = TextBlob(sents.decode('ascii', 'ignore'))
    matches = [map(str, blob.sentences[i-1:i+2])      # previous, current and next sentence
               for i, s in enumerate(blob.sentences)  # i is the index, s is the element
               if search_words & set(s.words)]
    # return list(itertools.chain(''.join(str(y).replace('&rdquo', '').replace('&rsquo', '') for y in matches)))
    return list(itertools.chain(*matches))
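For anyone who wants to try the idea without the HTML and TextBlob parts, here is a minimal standard-library sketch of the two steps: grouping the forms found in a text by their stem, and pulling out matching sentences with their neighbours. It is not the code from the answer. The names toy_stem, group_word_forms and sentences_with_context are hypothetical, and toy_stem is only a crude stand-in for a real stemmer; with NLTK installed, PorterStemmer().stem could be passed as the stem argument instead. It also clamps the slice start with max(), since a slice such as sentences[i-1:i+2] starts from the end of the list when i is 0.

```python
import re
from collections import defaultdict


def toy_stem(word):
    """Hypothetical stand-in stemmer: strips a few common English suffixes."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def group_word_forms(text, stem=toy_stem):
    """Map each stem to the sorted list of distinct forms found in the text."""
    forms = defaultdict(set)
    # Lower-case before stemming so "Works" and "works" count as one form.
    for token in re.findall(r"[a-z]+", text.lower()):
        forms[stem(token)].add(token)
    return {root: sorted(found) for root, found in forms.items()}


def sentences_with_context(sentences, search_stems, stem=toy_stem):
    """Return each matching sentence together with its previous and next one."""
    hits = []
    for i, sentence in enumerate(sentences):
        stems = {stem(tok) for tok in re.findall(r"[a-z]+", sentence.lower())}
        if search_stems & stems:
            # max() keeps the slice from wrapping around when i == 0.
            hits.append(sentences[max(i - 1, 0): i + 2])
    return hits


sentences = ["She works late.", "He is working now.",
             "Work never ends.", "They rested."]
forms = group_word_forms(" ".join(sentences))
print(forms["work"])  # ['work', 'working', 'works']
print(sentences_with_context(sentences, {"rest"}))
```

The search terms are given as stems here ({"rest"}), which mirrors stemming both the text and the word list before comparing them.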