Thursday, October 12, 2017
corenlp
Step 1: tokenize the raw text (PTBTokenizer writes one token per line)
java -cp "stanford-corenlp-full-2017-06-09/*" edu.stanford.nlp.process.PTBTokenizer 2-malware.txt > 2-malware.tok
less 2-malware.tok
Step 2: label every token O (background, not an entity):
perl -ne 'chomp; print "$_\tO\n"' 2-malware.tok > 2-malware.tsv
Step 3: relabel the tokens listed in the customized KEYWORD file to generate final.tsv
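No command was recorded for this step, so here is a minimal sketch of one way to do it with awk. It assumes the keyword file holds one single-token keyword per line and that the entity label is MALWARE; the function name label_keywords, the label, and the sample file names are all made up for illustration.

```shell
# Hypothetical helper: relabel tokens that appear in a keyword file.
# Usage: label_keywords KEYWORDS.txt INPUT.tsv LABEL > OUTPUT.tsv
label_keywords() {
  awk -F'\t' -v OFS='\t' -v label="$3" '
    NR == FNR { if ($0 != "") kw[tolower($0)] = 1; next }  # first file: load keywords
    (tolower($1) in kw) { $2 = label }                      # second file: relabel matches
    { print }                                               # print every token line
  ' "$1" "$2"
}

# Tiny made-up sample so the sketch can be tried without the real data.
printf 'WannaCry\nPetya\n' > keywords.sample.txt
printf 'The\tO\nWannaCry\tO\nworm\tO\nspread\tO\n' > sample.in.tsv
label_keywords keywords.sample.txt sample.in.tsv MALWARE > sample.out.tsv
cat sample.out.tsv
```

On the real data this would be something like label_keywords KEYWORD 2-malware.tsv MALWARE > final.tsv. Matching is case-insensitive and per token, so multi-word keywords would need a different approach.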
Step 4: train the custom NER model using malware.prop (this creates malware-ner-model.ser.gz):
java -cp "stanford-corenlp-full-2017-06-09/*" edu.stanford.nlp.ie.crf.CRFClassifier -prop malware.prop
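The contents of malware.prop are not shown in these notes. As a rough guide, the sketch below writes a properties file modeled on the example in the Stanford NER FAQ, with trainFile and serializeTo set to the file names used in these steps. The feature flags are the FAQ's defaults, not necessarily what was used here, and the file is written as malware.prop.example so it does not overwrite an existing malware.prop.

```shell
# Write an example properties file (assumed settings, based on the Stanford NER FAQ).
cat > malware.prop.example <<'EOF'
# training data: one token per line, tab-separated token and label
trainFile = final.tsv
# where the trained model is saved
serializeTo = malware-ner-model.ser.gz
# column 0 is the word, column 1 is the label
map = word=0,answer=1

# feature flags (FAQ defaults; tune for your data)
useClassFeature=true
useWord=true
useNGrams=true
noMidNGrams=true
maxNGramLeng=6
usePrev=true
useNext=true
useSequences=true
usePrevSequences=true
maxLeft=1
useTypeSeqs=true
useTypeSeqs2=true
useTypeySequences=true
wordShape=chris2useLC
useDisjunctive=true
EOF
```

After adjusting it, copy it to malware.prop and run the training command above.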
Step 5: use the custom NER model to tag new text:
java -cp "stanford-corenlp-full-2017-06-09/*" edu.stanford.nlp.ie.crf.CRFClassifier -loadClassifier malware-ner-model.ser.gz -testFile sample.tsv.ini
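To pull only the detected entities out of the result, a small awk filter is enough. This sketch assumes the classifier output has been redirected to a file and that each line is tab-separated as token, gold label, predicted label; the function name list_entities and the sample file are made up for illustration.

```shell
# Hypothetical helper: print tokens whose predicted label (3rd column) is not O.
# Assumes tab-separated lines: token, gold label, predicted label.
list_entities() {
  awk -F'\t' 'NF >= 3 && $3 != "O" { print $1 "\t" $3 }' "$1"
}

# Tiny made-up sample standing in for real classifier output.
printf 'The\tO\tO\nWannaCry\tO\tMALWARE\nworm\tO\tO\n' > sample.pred.tsv
list_entities sample.pred.tsv
```

With real output, append > predictions.tsv to the Step 5 command and then run list_entities predictions.tsv.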
--------------------------------------
java -cp "stanford-corenlp-full-2017-06-09/*" edu.stanford.nlp.process.PTBTokenizer jane-austen-emma-ch1.txt > jane-austen-emma-ch1.tok
java -mx15g -cp "stanford-corenlp-full-2017-06-09/*" edu.stanford.nlp.pipeline.StanfordCoreNLP -outputFormat json -file jane-austen-emma-ch1.txt