[NLP] Implementing Keyword Extraction

Implementation code:

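// Requires the nodejieba package (npm install nodejieba).
const nodejieba = require('nodejieba');
const fs = require('fs');

// Maximum number of keywords to return.
const topN = 100;

// Read the input text as UTF-8 and echo it.
const data = fs.readFileSync('t.txt', 'utf8');
console.log(data);

// Extract up to topN keywords; each entry has the shape { word, weight }.
const result = nodejieba.extract(data, topN);
console.log('11==>', result);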
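
The array returned by `nodejieba.extract` holds `{ word, weight }` objects, and the run shown later in this post includes tokens such as '11' and '有个' that are rarely useful as keywords. The following is a minimal post-filtering sketch, not part of the original script. The helper `pickKeywords`, its `minWeight` and `ignore` options, and the threshold of 5 are all hypothetical choices for illustration; the sample entries are copied from the output in this post so the block runs without nodejieba installed.

```javascript
// Sample entries copied from the output shown in this post.
const sample = [
  { word: '陨石', weight: 45.6077707943 },
  { word: '11', weight: 11.739204307083542 },
  { word: '有个', weight: 11.739204307083542 },
  { word: '香格里拉', weight: 10.642581114 },
  { word: '一个', weight: 2.81755097213 },
];

// Hypothetical helper: drop purely numeric tokens, words in a
// caller-supplied ignore set, and entries below minWeight, then
// return only the words (input order is preserved).
function pickKeywords(entries, { minWeight = 0, ignore = new Set() } = {}) {
  return entries
    .filter((e) => !/^\d+$/.test(e.word))
    .filter((e) => !ignore.has(e.word))
    .filter((e) => e.weight >= minWeight)
    .map((e) => e.word);
}

const keywords = pickKeywords(sample, {
  minWeight: 5,
  ignore: new Set(['有个']),
});
console.log(keywords);
```

In the real script, `sample` would be replaced by `result`. The threshold and ignore list depend on the corpus and should be tuned rather than taken from this sketch.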

Input file t.txt (a Chinese news excerpt):

据中国之声《新闻纵横》报道,在刚刚过去的中秋之夜,一颗“火流星”滑亮了云南省迪庆州的夜空。根据相关天文机构公布的信息,陨石坠落的地点,可能位于香格里拉市的巴拉格宗景区范围内。

事发一周之后,昨天(11日)下午,记者专访了巴拉格宗景区相关人员。对方称,目前还是没有确定陨石坠落的具体位置。最近,有很多人员都在当地寻找陨石,但至今没有任何消息。虽然陨石还没有找到,但在网上有关陨石归属的问题已经引发了讨论。

巴拉格宗景区的工作人员洛桑培楚说,事发当时,景区的多位工作人员都目睹了那颗“火流星”,“因为我们酒店的位置,刚好是在一个U字型的峡谷里,感觉突然间天空特别亮,有个东西就飞过来了,打在对面的崖壁上,过了几分钟之后,就听见咚的一声,附近村民有明显的震感。”

Output:

liuyugang:NodeJieBa apple$ node nodenlp.js
....
11==> [ { word: '陨石', weight: 45.6077707943 },
  { word: '格宗', weight: 35.21761292125063 },
  { word: '景区', weight: 32.27518069876 },
  { word: '巴拉', weight: 29.735080816230003 },
  { word: '火流星', weight: 24.582479479 },
  { word: '坠落', weight: 18.22637181838 },
  { word: '事发', weight: 16.80701885336 },
  { word: '工作人员', weight: 13.28734988976 },
  { word: '震感', weight: 12.5143832909 },
  { word: '迪庆', weight: 11.9547675029 },
  { word: '11', weight: 11.739204307083542 },
  { word: '培楚', weight: 11.739204307083542 },
  { word: '有个', weight: 11.739204307083542 },
  { word: '人员', weight: 11.18200151198 },
  { word: '新闻纵横', weight: 11.0103058941 },
  { word: '具体位置', weight: 10.8096351986 },
  { word: '飞过来', weight: 10.765183436 },
  { word: '香格里拉', weight: 10.642581114 },
  { word: '洛桑', weight: 10.2630914922 },
  { word: '字型', weight: 10.0088573539 },
  { word: '相关', weight: 9.67141986604 },
  { word: '崖壁', weight: 9.65218240993 },
  { word: '没有', weight: 9.338470695449999 },
  { word: '目睹', weight: 8.79473217808 },
  { word: '之后', weight: 8.7536825453 },
  { word: '夜空', weight: 8.75318317516 },
  { word: '之夜', weight: 8.65893063692 },
  { word: '中秋', weight: 8.55357012126 },
  { word: '那颗', weight: 8.5488195185 },
  { word: '几分钟', weight: 8.4980002701 },
  { word: '专访', weight: 8.35941410682 },
  { word: '多位', weight: 8.01735526349 },
  { word: '云南省', weight: 8.00903344015 },
  { word: '归属', weight: 8.00078029839 },
  { word: '刚好', weight: 7.90174109003 },
  { word: '之声', weight: 7.58531965045 },
  { word: '天文', weight: 7.45973111134 },
  { word: '峡谷', weight: 7.41757030052 },
  { word: '村民', weight: 7.28595205177 },
  { word: '酒店', weight: 7.19748953873 },
  { word: '对面', weight: 7.13679274341 },
  { word: '天空', weight: 6.90491149567 },
  { word: '一颗', weight: 6.84364067028 },
  { word: '地点', weight: 6.68250081357 },
  { word: '一周', weight: 6.6090214428 },
  { word: '讨论', weight: 6.28144423575 },
  { word: '引发', weight: 6.18600017817 },
  { word: '网上', weight: 6.15610784262 },
  { word: '寻找', weight: 6.04010686644 },
  { word: '下午', weight: 5.96939289045 },
  { word: '昨天', weight: 5.92683327603 },
  { word: '听见', weight: 5.92339566522 },
  { word: '报道', weight: 5.88040717916 },
  { word: '刚刚', weight: 5.78366356424 },
  { word: '最近', weight: 5.76738379075 },
  { word: '位置', weight: 5.67463922249 },
  { word: '找到', weight: 5.66161232021 },
  { word: '感觉', weight: 5.64147828931 },
  { word: '确定', weight: 5.35063012369 },
  { word: '信息', weight: 5.25386069277 },
  { word: '范围', weight: 5.19468393767 },
  { word: '附近', weight: 5.16934129144 },
  { word: '一声', weight: 5.15269025031 },
  { word: '公布', weight: 5.06198083963 },
  { word: '消息', weight: 5.03989475617 },
  { word: '突然', weight: 4.99713421631 },
  { word: '位于', weight: 4.96609078159 },
  { word: '很多', weight: 4.85828267085 },
  { word: '东西', weight: 4.77328420082 },
  { word: '过去', weight: 4.75519585235 },
  { word: '特别', weight: 4.74775455087 },
  { word: '当时', weight: 4.67584283385 },
  { word: '机构', weight: 4.65227107919 },
  { word: '明显', weight: 4.63964416568 },
  { word: '记者', weight: 4.29694475313 },
  { word: '问题', weight: 3.96351357308 },
  { word: '目前', weight: 3.91528758382 },
  { word: '可能', weight: 3.74802798573 },
  { word: '已经', weight: 3.42054864564 },
  { word: '中国', weight: 3.02732068666 },
  { word: '一个', weight: 2.81755097213 } ]
liuyugang:NodeJieBa apple$
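
A pattern in the weights above is worth noting: '11', '培楚' and '有个' all share the weight 11.739204307083542, and '格宗', which occurs three times in t.txt, has a weight of 35.21761292125063, three times that value. This is consistent with each weight being the word's occurrence count multiplied by a per-word IDF value, with one shared default for words missing from the IDF table. Assuming that reading is right, the scoring step can be sketched as below. The IDF numbers, the `fallbackIdf` default, the function name `scoreKeywords`, and the pre-segmented token list are all illustrative assumptions, not values or code taken from nodejieba.

```javascript
// Illustrative IDF table; these numbers are placeholders, not values
// read from nodejieba's dictionary.
const idf = new Map([
  ['陨石', 9.0],
  ['景区', 8.0],
]);

// Hypothetical default used for words that are missing from the table.
const fallbackIdf = 11.0;

// Score each distinct token as (occurrence count) * (IDF), then return
// the topN entries sorted by descending weight.
function scoreKeywords(tokens, topN) {
  const counts = new Map();
  for (const t of tokens) {
    counts.set(t, (counts.get(t) || 0) + 1);
  }
  const scored = [];
  for (const [word, count] of counts) {
    const wordIdf = idf.has(word) ? idf.get(word) : fallbackIdf;
    scored.push({ word, weight: count * wordIdf });
  }
  scored.sort((a, b) => b.weight - a.weight);
  return scored.slice(0, topN);
}

// Tokens as a segmenter might produce them (assumed, already segmented).
const tokens = ['陨石', '景区', '格宗', '陨石', '格宗', '格宗', '培楚'];
const ranked = scoreKeywords(tokens, 3);
console.log(ranked);
```

Under this scheme a word absent from the table can outrank a frequent known word, which would explain why fragments such as '格宗' and '培楚' (pieces of the proper nouns 巴拉格宗 and 洛桑培楚) rank so high in the real output.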



Reposted from blog.csdn.net/baihuaxiu123/article/details/78234531