Dimensionality reduction combat! Why do I think data structures and algorithms are important for front-end development

 

A thing from GitHub issue on the talk: HTTP S: // github.com/LeuisK EN / leuisken.github.io / AND DELINQUENCY / 2, I am referring to the demand which took the  issue inside of me .

From a demand

Before the project I once encountered such a demand, the preparation of a cascade selector, something like this:

 

11

 

Example of FIG using Ant-Design of Cascader assembly.

To accomplish this, I need a data structure like this:

var data = [{
  "value": "浙江",
  "children": [{
    "value": "杭州",
    "children": [{
      "value": "西湖"
    }]
  }]
}, {
  "value": "四川",
  "children": [{
    "value": "成都",
    "children": [{
      "value": "锦里"
    }, {
      "value": "方所"
    }]
  }, {
    "value": "阿坝",
    "children": [{
      "value": "九寨沟"
    }]
  }]
}]

Data has a hierarchical structure, very easy to implement this feature, because the structure and components of this structure is consistent, recursively traverse it.

However, due to the back-end is usually a relational database, the data returned is usually something like this:

var data = [{
  "province": "浙江",
  "city": "杭州",
  "name": "西湖"
}, {
  "province": "四川",
  "city": "成都",
  "name": "锦里"
}, {
  "province": "四川",
  "city": "成都",
  "name": "方所"
}, {
  "province": "四川",
  "city": "阿坝",
  "name": "九寨沟"
}]

The front side wants to convert the data to look at actually not difficult, because you want to merge duplicate entries, you can reference data deduplication method to do, so I wrote this version.

'use strict'

/**
 * 将一个没有层级的扁平对象,转换为树形结构({value, children})结构的对象
 * @param {array} tableData - 一个由对象构成的数组,里面的对象都是扁平的
 * @param {array} route - 一个由字符串构成的数组,字符串为前一数组中对象的key,最终
 * 输出的对象层级顺序为keys中字符串key的顺序
 * @return {array} 保存具有树形结构的对象
 */

var transObject = function(tableData, keys) {
  let hashTable = {}, res = []
  for( let i = 0; i < tableData.length; i++ ) {
    if(!hashTable[tableData[i][keys[0]]]) {
      let len = res.push({
        value: tableData[i][keys[0]],
        children: []
      })
      // 在这里要保存key对应的数组序号,不然还要涉及到查找
      hashTable[tableData[i][keys[0]]] = { $$pos: len - 1 }
    }
    if(!hashTable[tableData[i][keys[0]]][tableData[i][keys[1]]]) {
      let len = res[hashTable[tableData[i][keys[0]]].$$pos].children.push({
        value: tableData[i][keys[1]],
        children: []
      })
      hashTable[tableData[i][keys[0]]][tableData[i][keys[1]]] = { $$pos: len - 1 }
    }
    res[hashTable[tableData[i][keys[0]]].$$pos].children[hashTable[tableData[i][keys[0]]][tableData[i][keys[1]]].$$pos].children.push({
      value: tableData[i][keys[2]]
    })
  }
  return res
}

var data = [{
  "province": "浙江",
  "city": "杭州",
  "name": "西湖"
}, {
  "province": "四川",
  "city": "成都",
  "name": "锦里"
}, {
  "province": "四川",
  "city": "成都",
  "name": "方所"
}, {
  "province": "四川",
  "city": "阿坝",
  "name": "九寨沟"
}]

var keys = ['province', 'city', 'name']

console.log(transObject(data, keys))

Fortunately, the length of only three keys, this kind of thing simply no way to write long, it can be seen clearly that there are duplicate section, you can get through the cycle, but do not think for a long time thinking, shelved.

Then, one day after dinner is not very busy, just next to the colleagues chatted for data that needs to ask how to use loops to process. He looked at me and asked: "Do you know the trie it?." I first heard the concept, he simply told me a moment, then said, feeling somewhat similar deal with the problem, so that I can look at the principles of trie tree and try to optimize it.

Reason, trie tree data structure does have a lot of information online, but few use JavaScript to achieve, but the principle has not been difficult. After trying, I will transObjectcode optimization become so. (On the trie, also please the reader to read the related materials)

var transObject = function(tableData, keys) {
  let hashTable = {}, res = []
  for (let i = 0; i < tableData.length; i++) {
    let arr = res, cur = hashTable
    for (let j = 0; j < keys.length; j++) {
      let key = keys[j], filed = tableData[i][key]
      if (!cur[filed]) {
        let pusher = {
          value: filed
        }, tmp
        if (j !== (keys.length - 1)) {
          tmp = []
          pusher.children = tmp
        }
        cur[filed] = { $$pos: arr.push(pusher) - 1 }
        cur = cur[filed]
        arr = tmp
      } else {
        cur = cur[filed]
        arr = arr[cur.$$pos].children
      }
    }
  }
  return res
}

In this way, the solution to and independent of the length of the keys.

As this solution inside "three-body" use "dichroic foil" to reduce the dimension of the universe civilization general strike neat!

If you do not understand the concepts "Trie" tree, you can continue to see learning to read down.

This is written before an old article, Xiao Wu made some changes here and typesetting

Trie tree

Trie takes its name from "retrieval", retrieval, because you can only use a prefix Trie will be able to find the word you want in a dictionary.
While agreeing with the pronunciation "Tree", but for such a dictionary with common binary tree to show the difference, Xiao Wu, general programmer read "Trie" reread the tail it will soon be understood as read "TreeE."

Trie tree, also called "dictionary tree." As the name suggests, it is a tree structure . It is a specialized processing data structures that match the string used to solve quickly find a string in a set of strings collection problems.

In addition Trie tree, also known as prefix tree (because of the common prefix descendants of a node, such as the pan is a prefix of the panda).

Its key are strings, and the insertion can be done efficiently query time complexity of O (k), k is a string, the disadvantage is very large memory consumption if there is no common prefix string.

Its core idea is that by minimizing unnecessary string comparisons, so that the query efficiency, that is, "with space for time", re-use common prefixes to improve query efficiency.

Features of the Trie

Suppose there are five strings, which are: code, cook, five, file, fat. Which now need to find out if a string exists several times. If every look, are taking the string you're looking for with these five strings string matching in turn, that efficiency is relatively low, there is no more efficient way to do that?

If these are organized into five strings under the structure map, scanning from the naked eye in the past is not the senses will be more rapidly than find them.

Trie tree looksTrie tree looks

By the figure, it can be found in the Trie three characteristics:

  • The root node does not include characters, each node except the root node contains only one character
  • String from the root node to a node on the path through the connected characters, corresponding to that node
  • All the characters in each node contains child nodes are not the same

通过动画理解 Trie 树构造的过程。在构造过程中的每一步,都相当于往 Trie 树中插入一个字符串。当所有字符串都插入完成之后,Trie 树就构造好了。

 Trie tree structure Trie 树构造

Trie树的插入操作

Trie tree insertTrie树的插入操作

Trie树的插入操作很简单,其实就是将单词的每个字母逐一插入 Trie树。插入前先看字母对应的节点是否存在,存在则共享该节点,不存在则创建对应的节点。比如要插入新单词cook,就有下面几步:

  • 插入第一个字母 c,发现 root 节点下方存在子节点 c,则共享节点 c
  • 插入第二个字母 o,发现 c 节点下方存在子节点 o,则共享节点 o
  • 插入第三个字母 o,发现 o 节点下方不存在子节点 o,则创建子节点 o
  • 插入第三个字母 k,发现 o 节点下方不存在子节点 k,则创建子节点 k
  • 至此,单词 cook 中所有字母已被插入 Trie树 中,然后设置节点 k 中的标志位,标记路径 root-&gt;c-&gt;o-&gt;o-&gt;k这条路径上所有节点的字符可以组成一个单词cook

Trie树的查询操作

在 Trie 树中查找一个字符串的时候,比如查找字符串 code,可以将要查找的字符串分割成单个的字符 c,o,d,e,然后从 Trie 树的根节点开始匹配。如图所示,绿色的路径就是在 Trie 树中匹配的路径。

code matching pathcode的匹配路径

如果要查找的是字符串cod(鳕鱼)呢?还是可以用上面同样的方法,从根节点开始,沿着某条路径来匹配,如图所示,绿色的路径,是字符串cod匹配的路径。但是,路径的最后一个节点「d」并不是橙色的,并不是单词标志位,所以cod字符串不存在。也就是说,cod是某个字符串的前缀子串,但并不能完全匹配任何字符串。

cod matching pathcod的匹配路径

Trie树的删除操作

Trie树的删除操作与二叉树的删除操作有类似的地方,需要考虑删除的节点所处的位置,这里分三种情况进行分析:

删除整个单词(比如 hi )

Delete the entire word删除整个单词

  • 从根节点开始查找第一个字符h
  • 找到h子节点后,继续查找h的下一个子节点i
  • i是单词hi的标志位,将该标志位去掉
  • i节点是hi的叶子节点,将其删除
  • 删除后发现h节点为叶子节点,并且不是单词标志位,也将其删除
  • 这样就完成了hi单词的删除操作

删除前缀单词(比如 cod )

Delete the word prefix删除前缀单词
这种方式删除比较简单。
只需要将cod单词整个字符串查找完后,d节点因为不是叶子节点,只需将其单词标志去掉即可。

 

删除分支单词(比如 cook )

Delete the word branch删除分支单词
删除整个单词 情况类似,区别点在于删除到 cook 的第一个 o 时,该节点为非叶子节点,停止删除,这样就完成cook字符串的删除操作。

 

Trie树的应用

事实上 Trie树 在日常生活中的使用随处可见,比如这个:

具体来说就是经常用于统计和排序大量的字符串(但不仅限于字符串),所以经常被搜索引擎系统用于文本词频统计。它的优点是:最大限度地减少无谓的字符串比较,查询效率比哈希表高。

1. 前缀匹配

例如:找出一个字符串集合中所有以 五分钟 开头的字符串。我们只需要用所有字符串构造一个 trie树,然后输出以 五−>分−>钟 开头的路径上的关键字即可。

trie树前缀匹配常用于搜索提示。如当输入一个网址,可以自动搜索出可能的选择。当没有完全匹配的搜索结果,可以返回前缀最相似的可能

google searchgoogle搜索

2. 字符串检索

给出 N 个单词组成的熟词表,以及一篇全用小写英文书写的文章,按最早出现的顺序写出所有不在熟词表中的生词。

检索/查询功能是Trie树最原始的功能。给定一组字符串,查找某个字符串是否出现过,思路就是从根节点开始一个一个字符进行比较:

  • 如果沿路比较,发现不同的字符,则表示该字符串在集合中不存在。
  • 如果所有的字符全部比较完并且全部相同,还需判断最后一个节点的标志位(标记该节点是否代表一个关键字)。

Trie树的局限性

如前文所讲,Trie 的核心思想是空间换时间,利用字符串的公共前缀来降低查询时间的开销以达到提高效率的目的。

假设字符的种数有m个,有若干个长度为n的字符串构成了一个 Trie树 ,则每个节点的出度为 m(即每个节点的可能子节点数量为m),Trie树 的高度为n。很明显我们浪费了大量的空间来存储字符,此时Trie树的最坏空间复杂度为O(m^n)。也正由于每个节点的出度为m,所以我们能够沿着树的一个个分支高效的向下逐个字符的查询,而不是遍历所有的字符串来查询,此时Trie树的最坏时间复杂度为O(n)

这正是空间换时间的体现,也是利用公共前缀降低查询时间开销的体现。

LeetCode No. 208 problem is to achieve Trie (Prefix Tree) , junior partner interested can go to look at the practical operation.

I hope that today this article will help you recognize the master data structure can bring much help at work, we refuel :)

 

Guess you like

Origin www.cnblogs.com/fivestudy/p/10949774.html