Data Structure
HashTable / HashMap
https://algs4.cs.princeton.edu/34hash/
Hash function should be:
- It should be deterministic—equal keys must produce the same hash value.
- It should be efficient to compute.
- It should uniformly distribute the keys.
Collision Resolutions:
- separate chaining -- a linked list
- linear probing -- find next available empty slot
Java, 3 things you should know about hashCode():
- An object’s hash code allows algorithms and data structures to put objects into compartments for hash-based collections, such as HashMap and HashSet.
Objects that are equal must have the same hash code within a running process. (may not remain consistent from one execution of an application to another execution of the same application.)
Unequal objects must have different hash codes – WRONG!
Objects with the same hash code must be equal – WRONG!
Whenever you implement equals, you MUST also implement hashCode.
If you fail to do so, you will end up with broken objects. Why? An object’s hashCode method must take the same fields into account as its equals method. By overriding the equals method, you’re declaring some objects as equal to other objects, but the original hashCode method treats all objects as different. So you will have equal objects with different hash codes. For example, calling contains() on a HashMap will return false, even though the object has been added.
Key Composition
Some grouping problem can be solved easily with hash map lookup if we can figure the same key for the group.
- Anagram: the key can be character counts / characters in sorted order (string). 49. Group Anagrams
- Sequence: 249. Group Shifted Strings
- DFS traversal signature: 694. Number of Distinct Islands
- PreSum map to Index: 325. Maximum Size Subarray Sum Equals k
Tries Tree
https://www.youtube.com/watch?v=zIjfhVPRZCg
208 Implement Trie (Prefix Tree)
If we need to use the word string, we might store the word string instead of isCompleteWord indicator.
Optimise Word Validation: Lookup based on previous call. Return node reference in the search method.
Avoid duplicated result by removing the word from the tries tree dict. Then no need to remove duplicates in results set.
private class Node {
Map<Character, Node> children = new HashMap<>();
//or store the String word and use ==null to indicate isCompleteWord,
//depending on the problem
boolean isWord = false;
}
Implement a MapSum class with insert, and sum methods.
For the method insert, you'll be given a pair of (string, integer).
The string represents the key and the integer represents the value.
If the key already existed, then the original key-value pair will be overridden to the new one.
For the method sum, you'll be given a string representing the prefix,
and you need to return the sum of all the pairs' value whose key starts with the prefix.
Example 1:
Input: insert("apple", 3), Output: Null
Input: sum("ap"), Output: 3
Input: insert("app", 2), Output: Null
Input: sum("ap"), Output: 5
A few optimized coding styles:
- No need isWord indicator because the value of the node is default to 0 if it is not a key. Summation of zero just like ignoring the non-key nodes.
- for(char c: key.toCharArray()) will simplify the code.
class MapSum {
class Node {
Map<Character, Node> children;
int val;
public Node() {
children = new HashMap<Character, Node>();
val = 0;
}
}
private Node root;
/** Initialize your data structure here. */
public MapSum() {
root = new Node();
}
public void insert(String key, int val) {
Node curr = root;
for(char c: key.toCharArray()) {
Node next = curr.children.get(c);
if(next==null) {
next = new Node();
curr.children.put(c, next);
}
curr = next;
}
curr.val = val;
}
public int sum(String prefix) {
Node curr = root;
for(char c: prefix.toCharArray()) {
Node next = curr.children.get(c);
if(next==null) return 0;
curr = next;
}
return getSumWithPrefix(curr);
}
public int getSumWithPrefix(Node start) {
int sum = 0;
for(Node child: start.children.values()) {
sum += getSumWithPrefix(child);
}
return sum + start.val;
}
}