Java에서 표준 Trie 기반 맵 구현은 어디에서 찾을 수 있습니까?

Development Tip

Java에서 표준 Trie 기반 맵 구현은 어디에서 찾을 수 있습니까?

yourdevel 2020. 10. 27. 23:34

Java에서 표준 Trie 기반 맵 구현은 어디에서 찾을 수 있습니까?

Strings에서 다양한 객체로의 많은 매핑을 저장하는 Java 프로그램이 있습니다.

지금 내 옵션은 해싱 (HashMap을 통해) 또는 이진 검색 (TreeMap을 통해)에 의존하는 것입니다. 인기 있고 품질이 좋은 컬렉션 라이브러리에 효율적이고 표준 트라이 기반지도 구현이 있는지 궁금합니다.

나는 과거에 내 자신의 글을 썼지 만 가능한 경우 표준을 사용하고 싶습니다.

빠른 설명 : 제 질문은 일반적이지만 현재 프로젝트에서는 정규화 된 클래스 이름 또는 메서드 서명으로 인덱싱 된 많은 데이터를 다루고 있습니다. 따라서 많은 공유 접두사가 있습니다.

Limewire가 Google Guava에 기여 하고있는 Trie 구현 을 살펴볼 수 있습니다 .

핵심 Java 라이브러리에는 trie 데이터 구조가 없습니다.

시도는 일반적으로 문자열을 저장하도록 설계되는 반면 Java 데이터 구조는 일반적으로 임의 Object(동등성 및 해시 작업 정의)를 보유하고 있지만 때때로 Comparable객체 (순서 정의)로 제한 되기 때문일 수 있습니다 . CharSequence문자열에 적합 하지만 "기호 시퀀스"에 대한 일반적인 추상화는 없으며 Iterable다른 유형의 기호에 대해 뭔가를 할 수 있다고 생각합니다 .

고려해야 할 또 다른 사항이 있습니다. Java에서 기존의 시도를 구현하려고 할 때 Java가 유니 코드를 지원한다는 사실에 금방 직면하게됩니다. 어떤 종류의 공간 효율성을 얻으려면 트라이의 문자열을 기호의 일부 하위 집합으로 제한하거나 기호로 인덱싱 된 배열에 자식 노드를 저장하는 기존 방식을 포기해야합니다. 이것이 시도가 코어 라이브러리에 포함되기에 충분한 범용으로 간주되지 않는 또 다른 이유이며, 자체 라이브러리를 구현하거나 타사 라이브러리를 사용하는 경우주의해야 할 사항입니다.

동시 트리 도 확인하십시오 . 기수 및 접미사 트리를 모두 지원하며 동시성이 높은 환경을 위해 설계되었습니다.

Apache Commons Collections v4.0은 이제 trie 구조를 지원합니다.

자세한 내용은 org.apache.commons.collections4.trie패키지 정보 를 참조하십시오. 특히 PatriciaTrie클래스를 확인하십시오 .

PATRICIA Trie (영숫자로 코딩 된 정보를 검색하는 실제 알고리즘)의 구현.

PATRICIA Trie는 압축 된 Trie입니다. 모든 데이터를 Trie의 가장자리에 저장하는 대신 (그리고 내부 노드가 비어 있음) PATRICIA는 모든 노드에 데이터를 저장합니다. 이를 통해 매우 효율적인 순회, 삽입, 삭제, 선행 작업, 후속 작업, 접두사, 범위 및 select (Object) 작업이 가능합니다. 모든 작업은 O (K) 시간에 최악으로 수행됩니다. 여기서 K는 트리에서 가장 큰 항목의 비트 수입니다. 실제로 작업에는 실제로 O (A (K)) 시간이 걸립니다. 여기서 A (K)는 트리에있는 모든 항목의 평균 비트 수입니다.

여기에 간단하고 빠른 구현을 작성하고 게시했습니다 .

Apache의 공용 컬렉션 : org.apache.commons.collections4.trie.PatriciaTrie

당신에게 필요한 것은 org.apache.commons.collections.FastTreeMap라고 생각합니다.

Completely Java 라이브러리를 사용해 볼 수 있으며 PatriciaTrie 구현 이 특징 입니다. API는 작고 시작하기 쉬우 며 Maven 중앙 저장소 에서 사용할 수 있습니다 .

이 TopCoder도 볼 수 있습니다 (등록 필요 ...).

정렬 된지도가 필요한 경우 시도해 볼 가치가 있습니다. 그렇지 않으면 해시 맵이 더 좋습니다. 문자열 키가있는 해시 맵 은 표준 Java 구현보다 향상 될 수 있습니다. 배열 해시 맵

Scala 라이브러리를 가져 오는 것에 대해 걱정하지 않으시면 제가 작성한 burst trie 에서이 공간 효율적인 구현을 사용할 수 있습니다 .

https://github.com/nbauernfeind/scala-burst-trie

아래는 Trie의 기본 HashMap 구현입니다. 어떤 사람들은 이것이 유용하다고 생각할 수 있습니다 ...

class Trie {

    HashMap<Character, HashMap> root;

    public Trie() {
        root = new HashMap<Character, HashMap>();
    }

    public void addWord(String word) {
        HashMap<Character, HashMap> node = root;
        for (int i = 0; i < word.length(); i++) {
            Character currentLetter = word.charAt(i);
            if (node.containsKey(currentLetter) == false) {
                node.put(currentLetter, new HashMap<Character, HashMap>());
            }
            node = node.get(currentLetter);
        }
    }

    public boolean containsPrefix(String word) {
        HashMap<Character, HashMap> node = root;
        for (int i = 0; i < word.length(); i++) {
            Character currentLetter = word.charAt(i);
            if (node.containsKey(currentLetter)) {
                node = node.get(currentLetter);
            } else {
                return false;
            }
        }
        return true;
    }
}

여기에 내 구현이 있습니다. GitHub-MyTrie.java

/* usage:
    MyTrie trie = new MyTrie();
    trie.insert("abcde");
    trie.insert("abc");
    trie.insert("sadas");
    trie.insert("abc");
    trie.insert("wqwqd");
    System.out.println(trie.contains("abc"));
    System.out.println(trie.contains("abcd"));
    System.out.println(trie.contains("abcdefg"));
    System.out.println(trie.contains("ab"));
    System.out.println(trie.getWordCount("abc"));
    System.out.println(trie.getAllDistinctWords());
*/

import java.util.*;

public class MyTrie {
  private class Node {
    public int[] next = new int[26];
    public int wordCount;
    public Node() {
      for(int i=0;i<26;i++) {
        next[i] = NULL;
      }
      wordCount = 0;
    }
  }

  private int curr;
  private Node[] nodes;
  private List<String> allDistinctWords;
  public final static int NULL = -1;

  public MyTrie() {
    nodes = new Node[100000];
    nodes[0] = new Node();
    curr = 1;
  }

  private int getIndex(char c) {
    return (int)(c - 'a');
  }

  private void depthSearchWord(int x, String currWord) {
    for(int i=0;i<26;i++) {
      int p = nodes[x].next[i];
      if(p != NULL) {
        String word = currWord + (char)(i + 'a');
        if(nodes[p].wordCount > 0) {
          allDistinctWords.add(word);
        }
        depthSearchWord(p, word);
      }
    }
  }

  public List<String> getAllDistinctWords() {
    allDistinctWords = new ArrayList<String>();
    depthSearchWord(0, "");
    return allDistinctWords;
  }

  public int getWordCount(String str) {
    int len = str.length();
    int p = 0;
    for(int i=0;i<len;i++) {
      int j = getIndex(str.charAt(i));
      if(nodes[p].next[j] == NULL) {
        return 0;
      }
      p = nodes[p].next[j];
    }
    return nodes[p].wordCount;
  }

  public boolean contains(String str) {
    int len = str.length();
    int p = 0;
    for(int i=0;i<len;i++) {
      int j = getIndex(str.charAt(i));
      if(nodes[p].next[j] == NULL) {
        return false;
      }
      p = nodes[p].next[j];
    }
    return nodes[p].wordCount > 0;
  }

  public void insert(String str) {
    int len = str.length();
    int p = 0;
    for(int i=0;i<len;i++) {
      int j = getIndex(str.charAt(i));
      if(nodes[p].next[j] == NULL) {
        nodes[curr] = new Node();
        nodes[p].next[j] = curr;
        curr++;
      }
      p = nodes[p].next[j];
    }
    nodes[p].wordCount++;
  }
}

내 자신의 동시 TRIE 구현을 시도했지만 문자를 기반으로하지 않고 HashCode를 기반으로합니다. 여전히 우리는 각 CHAR hascode에 대해 Map of Map을 사용하여 사용할 수 있습니다.
@ https://github.com/skanagavelu/TrieHashMap/blob/master/src/TrieMapPerformanceTest.java https://github.com/skanagavelu/TrieHashMap/blob/master/src/TrieMapValidationTest.java 코드를 사용하여이를 테스트 할 수 있습니다.

import java.util.concurrent.atomic.AtomicReferenceArray;

public class TrieMap {
    public static int SIZEOFEDGE = 4; 
    public static int OSIZE = 5000;
}

abstract class Node {
    public Node getLink(String key, int hash, int level){
        throw new UnsupportedOperationException();
    }
    public Node createLink(int hash, int level, String key, String val) {
        throw new UnsupportedOperationException();
    }
    public Node removeLink(String key, int hash, int level){
        throw new UnsupportedOperationException();
    }
}

class Vertex extends Node {
    String key;
    volatile String val;
    volatile Vertex next;

    public Vertex(String key, String val) {
        this.key = key;
        this.val = val;
    }

    @Override
    public boolean equals(Object obj) {
        Vertex v = (Vertex) obj;
        return this.key.equals(v.key);
    }

    @Override
    public int hashCode() {
        return key.hashCode();
    }

    @Override
    public String toString() {
        return key +"@"+key.hashCode();
    }
}


class Edge extends Node {
    volatile AtomicReferenceArray<Node> array; //This is needed to ensure array elements are volatile

    public Edge(int size) {
        array = new AtomicReferenceArray<Node>(8);
    }


    @Override
    public Node getLink(String key, int hash, int level){
        int index = Base10ToBaseX.getBaseXValueOnAtLevel(Base10ToBaseX.Base.BASE8, hash, level);
        Node returnVal = array.get(index);
        for(;;) {
            if(returnVal == null) {
                return null;
            }
            else if((returnVal instanceof Vertex)) {
                Vertex node = (Vertex) returnVal;
                for(;node != null; node = node.next) {
                    if(node.key.equals(key)) {  
                        return node; 
                    }
                } 
                return null;
            } else { //instanceof Edge
                level = level + 1;
                index = Base10ToBaseX.getBaseXValueOnAtLevel(Base10ToBaseX.Base.BASE8, hash, level);
                Edge e = (Edge) returnVal;
                returnVal = e.array.get(index);
            }
        }
    }

    @Override
    public Node createLink(int hash, int level, String key, String val) { //Remove size
        for(;;) { //Repeat the work on the current node, since some other thread modified this node
            int index =  Base10ToBaseX.getBaseXValueOnAtLevel(Base10ToBaseX.Base.BASE8, hash, level);
            Node nodeAtIndex = array.get(index);
            if ( nodeAtIndex == null) {  
                Vertex newV = new Vertex(key, val);
                boolean result = array.compareAndSet(index, null, newV);
                if(result == Boolean.TRUE) {
                    return newV;
                }
                //continue; since new node is inserted by other thread, hence repeat it.
            } 
            else if(nodeAtIndex instanceof Vertex) {
                Vertex vrtexAtIndex = (Vertex) nodeAtIndex;
                int newIndex = Base10ToBaseX.getBaseXValueOnAtLevel(Base10ToBaseX.Base.BASE8, vrtexAtIndex.hashCode(), level+1);
                int newIndex1 = Base10ToBaseX.getBaseXValueOnAtLevel(Base10ToBaseX.Base.BASE8, hash, level+1);
                Edge edge = new Edge(Base10ToBaseX.Base.BASE8.getLevelZeroMask()+1);
                if(newIndex != newIndex1) {
                    Vertex newV = new Vertex(key, val);
                    edge.array.set(newIndex, vrtexAtIndex);
                    edge.array.set(newIndex1, newV);
                    boolean result = array.compareAndSet(index, vrtexAtIndex, edge); //REPLACE vertex to edge
                    if(result == Boolean.TRUE) {
                        return newV;
                    }
                   //continue; since vrtexAtIndex may be removed or changed to Edge already.
                } else if(vrtexAtIndex.key.hashCode() == hash) {//vrtex.hash == hash) {       HERE newIndex == newIndex1
                    synchronized (vrtexAtIndex) {   
                        boolean result = array.compareAndSet(index, vrtexAtIndex, vrtexAtIndex); //Double check this vertex is not removed.
                        if(result == Boolean.TRUE) {
                            Vertex prevV = vrtexAtIndex;
                            for(;vrtexAtIndex != null; vrtexAtIndex = vrtexAtIndex.next) {
                                prevV = vrtexAtIndex; // prevV is used to handle when vrtexAtIndex reached NULL
                                if(vrtexAtIndex.key.equals(key)){
                                    vrtexAtIndex.val = val;
                                    return vrtexAtIndex;
                                }
                            } 
                            Vertex newV = new Vertex(key, val);
                            prevV.next = newV; // Within SYNCHRONIZATION since prevV.next may be added with some other.
                            return newV;
                        }
                        //Continue; vrtexAtIndex got changed
                    }
                } else {   //HERE newIndex == newIndex1  BUT vrtex.hash != hash
                    edge.array.set(newIndex, vrtexAtIndex);
                    boolean result = array.compareAndSet(index, vrtexAtIndex, edge); //REPLACE vertex to edge
                    if(result == Boolean.TRUE) {
                        return edge.createLink(hash, (level + 1), key, val);
                    }
                }
            }               
            else {  //instanceof Edge
                return nodeAtIndex.createLink(hash, (level + 1), key, val);
            }
        }
    }




    @Override
    public Node removeLink(String key, int hash, int level){
        for(;;) {
            int index = Base10ToBaseX.getBaseXValueOnAtLevel(Base10ToBaseX.Base.BASE8, hash, level);
            Node returnVal = array.get(index);
            if(returnVal == null) {
                return null;
            }
            else if((returnVal instanceof Vertex)) {
                synchronized (returnVal) {
                    Vertex node = (Vertex) returnVal;
                    if(node.next == null) {
                        if(node.key.equals(key)) {
                            boolean result = array.compareAndSet(index, node, null); 
                            if(result == Boolean.TRUE) {
                                return node;
                            }
                            continue; //Vertex may be changed to Edge
                        }
                        return null;  //Nothing found; This is not the same vertex we are looking for. Here hashcode is same but key is different. 
                    } else {
                        if(node.key.equals(key)) { //Removing the first node in the link
                            boolean result = array.compareAndSet(index, node, node.next);
                            if(result == Boolean.TRUE) {
                                return node;
                            }
                            continue; //Vertex(node) may be changed to Edge, so try again.
                        }
                        Vertex prevV = node; // prevV is used to handle when vrtexAtIndex is found and to be removed from its previous
                        node = node.next;
                        for(;node != null; prevV = node, node = node.next) {
                            if(node.key.equals(key)) {
                                prevV.next = node.next; //Removing other than first node in the link
                                return node; 
                            }
                        } 
                        return null;  //Nothing found in the linked list.
                    }
                }
            } else { //instanceof Edge
                return returnVal.removeLink(key, hash, (level + 1));
            }
        }
    }

}



class Base10ToBaseX {
    public static enum Base {
        /**
         * Integer is represented in 32 bit in 32 bit machine.
         * There we can split this integer no of bits into multiples of 1,2,4,8,16 bits
         */
        BASE2(1,1,32), BASE4(3,2,16), BASE8(7,3,11)/* OCTAL*/, /*BASE10(3,2),*/ 
        BASE16(15, 4, 8){       
            public String getFormattedValue(int val){
                switch(val) {
                case 10:
                    return "A";
                case 11:
                    return "B";
                case 12:
                    return "C";
                case 13:
                    return "D";
                case 14:
                    return "E";
                case 15:
                    return "F";
                default:
                    return "" + val;
                }

            }
        }, /*BASE32(31,5,1),*/ BASE256(255, 8, 4), /*BASE512(511,9),*/ Base65536(65535, 16, 2);

        private int LEVEL_0_MASK;
        private int LEVEL_1_ROTATION;
        private int MAX_ROTATION;

        Base(int levelZeroMask, int levelOneRotation, int maxPossibleRotation) {
            this.LEVEL_0_MASK = levelZeroMask;
            this.LEVEL_1_ROTATION = levelOneRotation;
            this.MAX_ROTATION = maxPossibleRotation;
        }

        int getLevelZeroMask(){
            return LEVEL_0_MASK;
        }
        int getLevelOneRotation(){
            return LEVEL_1_ROTATION;
        }
        int getMaxRotation(){
            return MAX_ROTATION;
        }
        String getFormattedValue(int val){
            return "" + val;
        }
    }

    public static int getBaseXValueOnAtLevel(Base base, int on, int level) {
        if(level > base.getMaxRotation() || level < 1) {
            return 0; //INVALID Input
        }
        int rotation = base.getLevelOneRotation();
        int mask = base.getLevelZeroMask();

        if(level > 1) {
            rotation = (level-1) * rotation;
            mask = mask << rotation;
        } else {
            rotation = 0;
        }
        return (on & mask) >>> rotation;
    }
}

참고URL : https://stackoverflow.com/questions/623892/where-do-i-find-a-standard-trie-based-map-implementation-in-java

'Development Tip' 카테고리의 다른 글

트리 구조 간의 차이 감지 (0)	2020.10.27
PostgreSQL 프로세스가 "트랜잭션 유휴"상태라는 것은 무엇을 의미합니까? (0)	2020.10.27
F #의 애플리케이션 아키텍처 / 구성 (0)	2020.10.27
Android 개발을 위해 Vim을 어떻게 설정할 수 있습니까? (0)	2020.10.27
MySQL 데이터베이스를 SQLite 데이터베이스로 내보내기 (0)	2020.10.27

현재글Java에서 표준 Trie 기반 맵 구현은 어디에서 찾을 수 있습니까?

yourdevel

Java에서 표준 Trie 기반 맵 구현은 어디에서 찾을 수 있습니까?

Java에서 표준 Trie 기반 맵 구현은 어디에서 찾을 수 있습니까?

'Development Tip' 카테고리의 다른 글

'Development Tip'의 다른글

티스토리툴바

Java에서 표준 Trie 기반 맵 구현은 어디에서 찾을 수 있습니까?

Java에서 표준 Trie 기반 맵 구현은 어디에서 찾을 수 있습니까?

'Development Tip' 카테고리의 다른 글

'Development Tip'의 다른글

관련글

티스토리툴바