py-libs

Custom Python 3 libs and learnings.

In this package I provide descriptions of different data structures and algorithms, along with my own implementations and use cases.

Content

Data Structures

  1. Array
  2. Hash
  3. String
  4. Linked Lists
    1. Singly Linked List
    2. Doubly Linked List
  5. Memory cells replacement structures
    1. FIFO
    2. FILO
  6. Cache Strategies
  7. Trees
    1. Binary Search Tree
    2. Binary Heaps
      1. Min-Heap
      2. Max-Heap
      3. Continuous Median Handler
    3. Tries
  8. Graphs
    1. Common Graphs
    2. Fancy Graphs

Techniques

  1. Bit Manipulation
  2. Dynamic Programming
  3. Backtracking
  4. Recursion
  5. Math and Logic

Algorithms

  1. Sort
    1. Array Sort
    2. Linked List Sort
  2. Search
    1. Array Search Algorithms
    2. String Search
  3. Transformation
    1. Create BST from a sorted array
    2. Create DLL from BT: from left to right with flatten_binary_tree()
    3. Invert BT with invert_binary_tree()
  4. Traversal
    1. In-order traversal
    2. Pre-order traversal
    3. Post-order traversal
  5. Validation
    1. Binary Tree Traversal
  6. Merge
    1. Array Merge Algorithms
  7. Fancy Algorithms
    1. Topological Sort
    2. Dijkstra Algorithm
    3. A Star Algorithm

Useful links

Data Structures

Array

An array (or list in Python) is a data structure that contains elements, each of which has its own index. Python implementation: array = []

Complexity

  • Create - Time: O(N) / Space: O(N)
  • Insert/Delete/Search - Time: O(k) / Space: O(1), where k is the element index.

Nuances

Once we exceed the array's storage limit, we double its size and search for a new memory location if the next memory cells are occupied (when that happens, it takes O(N) time).

Algorithms / Usage / Problems

Don't forget to use the .copy() method when you need to duplicate an array, because Python assigns lists by reference.
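
A quick illustration of that aliasing behaviour (a minimal sketch):

original = [1, 2, 3]
alias = original                 # both names point to the same list object
alias.append(4)
print(original)                  # [1, 2, 3, 4] - the "copy" changed too

independent = original.copy()    # shallow copy: a brand new list object
independent.append(5)
print(original)                  # [1, 2, 3, 4] - unchanged
print(independent)               # [1, 2, 3, 4, 5]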

  1. Sorted Arrays

Exploit the sorted property of the array with 2 pointers or binary search.

  • Find a pair of elements whose sum equals a certain target number - 2 pointers (see the sketch at the end of this section).
  • Binary Search to find an element in O(log(N)) time.
  • Merge sorted arrays -> find more in Binary Heaps.
  2. Arrays
  • Use hash tables when you need to find a certain pair of elements in constant time. Hash tables can also be used to find sequences in the array, where the value in a key-value pair points to the next node.

  • Find duplicates:

    • with a hash table
    • sort the array and look for adjacent equal elements
    • if there is a constraint on the values such as numbers < len(array), use the array indexes themselves to track seen numbers by negating values -> a negative value at a given index shows that we have met that number before.
  • Subarray problems:

    • When you need to identify a certain spot in an array, try to work with a subarray of adjacent elements.
      • Find a peak (longest peak) in an array -> take 3 adjacent pointers to detect a peak, then expand around it.
    • Kadane's Algorithm - to find the max possible sum of adjacent elements (a subarray); see the sketch at the end of this section.
      • Here you should track local_sum (checking whether current_num alone is bigger than local_sum + current_num) and global_sum.
  • Many problems can be solved by traversing from the left (do something) and then from the right (do something). Keep it in mind. Subarray Sort

  • Also, following the above point, you can write your running calculation into an additional array while moving in one direction, and then refine it by moving back in the other direction.

Input = [a1, a2, a3]
What can you do in linear time?
additional_array = [x1, x2, x3]
First pass, moving from left to right:
	x1 = some initial value
	x2 = x1 * a1 (or similar)
	x3 = x2 * a2 (or similar)
Second pass, moving from right to left:
	do the same starting from the end.
  • In problems about determining whether an array is increasing/decreasing, you can use a direction variable to track which way you are going. Monotonic Array
  • You might treat elements of the array like graph nodes or a LL. You can explore a matrix using graph techniques.
  • Matrix traversal:
    • In certain problems of transforming a matrix into a 1D array, a direction variable can be used to decide which number comes next.
    • Also, it can be useful to keep in mind the 4 pointers - 4 directions method, e.g. to traverse in clockwise order.
  • Transformation:
    • When rotating a matrix, write down a 4x4 example and its final view and look for the pattern. Don't be afraid to simply flip the matrix (horizontally or vertically) as an intermediate step.
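
Minimal sketches of the two-pointer pair sum and Kadane's algorithm mentioned above (illustrative code with my own function names, not the repo's implementations):

def pair_with_target_sum(sorted_array, target):
    # two pointers moving towards each other over a sorted array
    left, right = 0, len(sorted_array) - 1
    while left < right:
        current = sorted_array[left] + sorted_array[right]
        if current == target:
            return [sorted_array[left], sorted_array[right]]
        if current < target:
            left += 1       # need a bigger sum
        else:
            right -= 1      # need a smaller sum
    return []


def kadanes_algorithm(array):
    # track the best subarray sum ending here (local) and overall (global)
    local_sum = global_sum = array[0]
    for num in array[1:]:
        local_sum = max(num, local_sum + num)
        global_sum = max(global_sum, local_sum)
    return global_sum


print(pair_with_target_sum([1, 3, 4, 6, 9], 10))  # [1, 9]
print(kadanes_algorithm([3, -4, 5, 1, -2, 4]))    # 8 -> subarray [5, 1, -2, 4]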

Hash

Python implementation: hash_table = {}. Entries are key: value pairs.

table = {1: 2}
table[5] = 4
for key in table:
    print(key)
for value in table.values():
    print(value)

Complexity

  • Create - Time: O(N) / Space: O(N)
  • Insert/Delete/Search - Time: O(1) average, O(N) worst (hash collisions) / Space: O(1). Note that a hash table has to read (hash) the key first before it can search in O(1) time. That's why true search with a hash table takes O(K) time, where K is the size of the key.

Nuances

Resizing the table once we exceed its limit takes O(N) time (like a dynamic array), because we re-hash all N entries.

Usage

  • Great to use when you need fast search/insert/delete over a large data set.
  • There is no order in a hash table, but you can add an additional attribute such as 'index' to impose one, or create a LinkedList-style structure by storing a 'next_node'. Next-node pointing can be used in many problems where quick search is required and we need to keep a certain relation with that particular node.
  • Nice to use when you need to count letters in strings, check whether the second string contains a permutation of the first one, and other variations of the same idea.
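
For example, a minimal sketch of counting letters and checking permutations with a plain dict (illustrative; collections.Counter does the same job):

def count_letters(word):
    counts = {}
    for letter in word:
        counts[letter] = counts.get(letter, 0) + 1
    return counts


def is_permutation(s1, s2):
    # two strings are permutations of each other
    # if they have identical letter counts
    return count_letters(s1) == count_letters(s2)


print(count_letters("banana"))             # {'b': 1, 'a': 3, 'n': 2}
print(is_permutation("listen", "silent"))  # True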

String

Python implementation: word = 'some_string'; to convert the int 10 to a string, use str(10). Strings are immutable in Python.

sentence = 'Just be a ranger in your soul'
splitted_sentence = sentence.split(' ')  # create list of words
symbol = '%20'
  
# splitted_sentence has to contain only strings for the .join method
print(symbol.join(splitted_sentence))  # insert symbol between strings

print(sentence.find('be'))  # find first appearance of 'be',
                            # if there is no, then return -1

Complexity

  • Create - Time: O(N) / Space: O(N)
  • Insert/Delete - Time: O(N) / Space: O(N) - we create a brand new string whenever we insert or delete any letter.
  • Search - Time: O(k) (k - index of letter) / Space: O(1)

Algorithms / Usage / Problems

Keep the following in mind:

  • Hash tables can be your friends in problems of unique letters.
  • 2 pointers to check letters from different indexes.
  • Prefix and suffix - how can you use them to solve your problem?
  • Use the trie data structure in problems of searching/matching on a prefix/suffix.
  • The ASCII table is involved in the majority of these problems.

Number conversion

  • Roman to Integer: create a LUT for pattern matching. If there is no corresponding entry in the LUT, then we are dealing with a new part of the number.

Search in string

  • Knuth-Morris-Pratt Algorithm - find whether a substring is in a string in O(M + N) time and O(M) space, where M is the length of the substring and N is the length of the string.

  • Aho-Corasick Algorithm [TBD] - find whether a substring is in a vocabulary, where vocabulary = [string_one, string_two, ...], in O(M + N + Z) time / ??? space.

  • With Tries - find whether strings are in big_string in O(NS + BS) time and O(NS) space, where N is the number of substrings in small_strings, S is the length of the longest element in small_strings, and B is the length of big_string. Tries are used for certain searching problems. Multi String Search

Unique letters

There is a huge variety of tasks that compare letters in strings. Hash tables are widely used for such topics:

  • Find duplicates in a string.
  • Compare the unique letters of two strings.
  • Smallest Substring Containing - find the smallest range in a big string which contains all symbols of a small string. Here we use a hash table to track symbols and 2 pointers for range adjustment (the sliding "window" method).

Edits in strings

  • How many edits need to be performed to get an anagram of string2 from string1? One Edit Or Less
    • Some anagram tasks can be solved by sorting all strings into alphabetical order.
  • Find the min number of operations needed to get string2 from string1. DP method. Min Number Of Edits

Compression / Matching

The main idea here is often to split the string, and then build it back with certain modifications using the .join method.

  • Convert 'Have a power in your soul' into 'Have%20a%20power%20in%20your%20soul'.
  • Compress sequence of same letters into shorter format like 'aaabb' to 'a3b2'. Word Compression

Here we try to replace certain substrings with predetermined patterns like "xxyxxy", "gogopowerrangergogopowerranger" => [x, y] = ["go", "powerranger"]. We split the string into letters and store them in arrays, and then with pure math we calculate whether it is possible to build the pattern for y given a chosen size of x.

Another pattern example is to find a special keyword in a string and mark it with a special symbol like _. Example: Underscorify Substring: ['test is in testest this'], 'test' => '_test_ is in _testest_ this'. To solve this, simply find every appearance of the special word in the string, merge overlapping intervals if needed and build a new string.

Rotation

  • String Rotation - check whether string2 contains a prefix of string1, and then check the remaining letters in the original order.

Palindromes

Palindrome - a string which reads the same backwards. To check whether a string is a palindrome, traverse it with 2 pointers from the start and from the end and check whether the letters match.

If you need to find the longest palindrome in a string, go through the string and at each index expand in both directions to check whether the surrounding substring is a palindrome.
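
A minimal two-pointer palindrome check (illustrative sketch):

def is_palindrome(string):
    left, right = 0, len(string) - 1
    while left < right:
        if string[left] != string[right]:
            return False
        left += 1
        right -= 1
    return True


print(is_palindrome("abcdcba"))  # True
print(is_palindrome("abcd"))     # False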

Permutation

Permutation - when you can change the order of characters in a string. To solve those problems, the Unique Letters topic comes into play.

Anagrams

If you need to work with anagrams like ['flop', 'olfp'], you might want to first sort the strings into alphabetical order and then compare them to check whether they are anagrams.
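
For example (a simple sketch; sorting costs O(N*log(N)) per string):

def are_anagrams(s1, s2):
    # anagrams contain exactly the same letters, so their sorted forms match
    return sorted(s1) == sorted(s2)


print(are_anagrams("flop", "olfp"))  # True
print(are_anagrams("flop", "flip"))  # False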

Linked Lists

Python implementation:

class LLNode:
    def __init__(self, value):
        self.value = value
        #self.prev = None	# optional: for Doubly LL
        self.next = None

Visualisation:
Singly LL (SLL): None->1->2->3->4->5->None
Doubly LL (DLL): None<->1<->2<->3<->4<->5<->None

Singly Linked List

Singly Linked List implementation

Methods:

  • add value
  • generate random SLL
  • SLL sorts: merge sort and bubble sort.
  • print SLL out

Doubly Linked List

Doubly Linked List implementation

Methods:

  • add node
  • set head/tail
  • insert node after/before/at idx
  • remove node / remove with value
  • print DLL out

It is advisable to keep track of the LL head and tail no matter which LL you are using. Keeping the tail (and using tail.next) gives instant node appending at the end of the LL.

Complexity

  • Create/Copy - Time: O(N) / Space: O(N)
  • Get/Insert/Delete/Search (GIDS) - Time: O(i) / Space: O(1), where i is the node index. The beauty of a LL is that you just need to reassign the node before and the node after to complete a GIDS operation in O(1) time, but first you have to traverse the LL to find that node.

Algorithms / Usage / Problems

A LL is great to use when you have to maintain a certain order of values/nodes, and you need to keep track of the head (start node) and tail (end node) of the LL. For instance, LLs are used to implement queues: FIFO, etc. A lot of problems can be solved by traversing and counting LL nodes. For instance, to delete the Kth node from the end - find the total number of nodes and then subtract K from it to define the node which you have to delete/modify. For other problems, you might use 2-3 pointers (3 for reversing) that move with different step sizes at each iteration of the traversal. The biggest difficulties are reassigning the previous and next nodes and handling the head & tail properly.

Tips:

  • Try to visualize nodes and their relations. Then, draw a solution with arrows.
  • Remember about null pointer.
  • Update Head and Tail if needed.
  • A lot of problems can be solved in place (no extra memory use).

Sort

There are 2 implemented sort methods:

  • bubble sort: Time O(N^2) / Space O(1)
  • merge sort: Time O(N*log(N)) / Space O(log(N)). Refer to singly_linked_list.py

LRU Cache

Implementation of a Least Recently Used (LRU) Cache, built on top of a DLL.

Runner technique

You iterate over the LL with 2 pointers simultaneously, one ahead of the other (fast and slow pointers).

Find loop

Your LL might contain a loop: Picture. To detect it, start traversing with 2 pointers, where at every iteration the 1st pointer moves 1 step (slow pointer) and the 2nd moves 2 steps (fast pointer). If the pointers meet at some point, there is a loop. It's best to draw it out yourself. Example: Find a loop
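
A minimal sketch of this fast/slow pointer loop detection (illustrative; it assumes the LLNode class shown earlier):

def has_loop(head):
    # Floyd's cycle detection: slow moves 1 step, fast moves 2 steps;
    # if they ever meet, the list contains a loop
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            return True
    return False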

Factorial replacement

a1->a2->...->an->b1->...->bn transforms into a1->b1->...->an->bn - traverse with 2 pointers as in Find a loop above. When the fast pointer hits the end, the slow pointer is at the middle. Then move the fast pointer back to the head and reassign nodes between the slow and fast pointers.

Rearrange LL

Sometimes, when you need to rebuild a LL based on some value k, it's nice to split the LL into 3 parts, building a separate LL for each of (node.value < k), (node.value == k) and (node.value > k). Then combine the 3 parts into one final LL. Be aware that there might be duplicates or no node with value k at all. Example: Rearrange LL

Palindrome

Split the linked list into 2 parts and compare them node by node. Example: SLL Palindrome Check

Memory cells replacement structures

The Last-In, First-Out (FILO) method assumes that the last unit to arrive in inventory (the most recent one) is sold first. The First-In, First-Out (FIFO) method assumes that the oldest unit of inventory is sold first. Complexity:

  • Create - Time: O(N) / Space: O(N)
  • Insert/Delete/Search - Time: O(k) / Space: O(1)
  • Add (append) - Time: O(1) / Space: O(1)
  • Pop last element - Time: O(1) / Space: O(1)

FILO

Or Stack - works in FILO order and can simply be implemented with an array.

Stack variations

There are different types of stacks, all sharing common methods like add(), pop(), peek() and is_empty():

  • Simple Stack
  • MinMax Stack - tracks min and max values -> to do that, we keep additional min and max stacks which store the min and max values at each index alongside the common stack array (see the sketch below).
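
A minimal MinMax Stack sketch along these lines (illustrative, with my own method names):

class MinMaxStack:
    def __init__(self):
        self.stack = []
        self.mins = []   # mins[i] = min of stack[:i+1]
        self.maxs = []   # maxs[i] = max of stack[:i+1]

    def add(self, value):
        self.stack.append(value)
        self.mins.append(min(value, self.mins[-1]) if self.mins else value)
        self.maxs.append(max(value, self.maxs[-1]) if self.maxs else value)

    def pop(self):
        self.mins.pop()
        self.maxs.pop()
        return self.stack.pop()

    def get_min(self):
        return self.mins[-1]

    def get_max(self):
        return self.maxs[-1]


s = MinMaxStack()
for v in (5, 2, 7):
    s.add(v)
print(s.get_min(), s.get_max())  # 2 7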

Usage

Stacks are often useful in recursive algorithms. Sometimes you need to push some data into memory and then remove it while returning back up the recursion. Stacks can also be used to implement a recursive algorithm iteratively. Another usage of stacks is to save elements in their initial order and then remove them in reverse order. This can be used for Tracking Close & Open Brackets, or when you need to simplify something (like a long path) into a shorter version, like Shorter Path.

FILO: stack implementation

Methods: add(), pop(), peek() and is_empty()

FIFO

Or Queue - works in FIFO order and can simply be implemented with a Singly Linked List. Be aware that it's easy to mess up the first and last nodes, so check them twice. A queue is used in breadth-first search and other techniques where you first append nodes to explore, and then explore those nodes in the order they were added to the queue.

FIFO: queue implementation

Methods: add(), pop(), peek() and is_empty()

Cache Strategies

LRU Cache

LRU Cache implementation

Methods:

  • create with user's desired size
  • insert key-value pair
  • get value from key
  • get most recent key

Trees

Binary Tree

In this tree each node has at most 2 children. More in BST.

BT Algorithms / Problems / Usage

  • depth first search (DFS) - explore each branch fully before switching to another branch (see the sketch after this list)
    • recursively
    • iteratively with a stack
  • breadth first search (BFS) - explore neighbours before their children
  • Traversal methods
  • When you explore some path, you might try to represent it as a LL or an array. This can be helpful in problems that track sums in a BT.
  • To determine whether bt_one is a subtree of bt_two:
    • do a pre-order traversal (including NULL nodes as some symbol X), which gives the same result for the subtree and the tree if they match in structure and values -> then do array matching.
    • alternative: call a match function at each node. There is a trade-off between space and time complexity in these two versions. Discuss it.
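
Minimal DFS/BFS sketches for a binary tree (illustrative; they assume nodes with value/left/right attributes as in the Node class below):

from collections import deque

def dfs(node, visit):
    # depth first (pre-order): go as deep as possible before switching branches
    if node is None:
        return
    visit(node.value)
    dfs(node.left, visit)
    dfs(node.right, visit)

def bfs(root, visit):
    # breadth first: visit all nodes of a level before their children
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node is None:
            continue
        visit(node.value)
        queue.append(node.left)
        queue.append(node.right)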

Binary Search Tree

Binary Search Tree (BST) has the following condition: node.left.value < node.value <= node.right.value.

Binary Search Tree implementation

class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None
		
class Tree:
    def __init__(self, value):
        self.root = Node(value)

Picture

Besides that, a balanced binary tree has log(N) depth, while a non-balanced tree might reach a full depth of N.

Picture Picture Picture

BST Algorithms / Problems

There are the follows:

  • BST validation
  • Find the min depth from a sorted array - either build a balanced BST or check its depth via the number of recursive calls. There are several things to keep track of: 1) return the root node as your output; 2) track the parent node and how you came to the current node (from the left or from the right?). In most cases, recursion helps to solve BST problems.
  • Convert a BST to all possible arrays - a recursive call at each node merges the arrays from the child nodes with the parent node in different ways.
  • Return Random Node - track how many children each node has and the total size of the BST, and from that calculate the probability of returning a certain node.

Binary Heaps

Min-Heap and Max-Heap are complete binary trees. In a Min-Heap the parent node has a lower value than its child nodes, while in a Max-Heap the parent node has a bigger value than its children. The implementation uses an array structure.

Min-Heap

Min-Heap Array implementation
Min-Heap Tree implementation
Methods: insert(), pop(), peek() and is_empty()

Max-Heap

Max-Heap Array implementation
Methods: insert(), pop(), peek() and is_empty()

Continuous Median Handler

The point is to track a median during adding more elements to the array.

Continuous Median Handler implementation

Complexity

  • Create (heapify) - Time: O(N) (this bound can be shown mathematically) / Space: O(N)
  • Insert/Delete/Search - Time: O(log(N)) / Space: O(1)

The relationship between child and parent node indexes in the array implementation is as follows:

  • parent_index = (child_index - 1) // 2
  • child_one_index = 2 * parent_index + 1 and child_two_index = 2 * parent_index + 2

The picture below shows this relationship on a Max-Heap example: Picture

Heaps Algorithms / Problems / Usage

Heaps can be used in various problems where we have to efficiently track the min and/or max values of some data structure (like an array) in log(N) time. A few examples are listed below:

  1. When we need to track the median value of some array, we can create 2 heaps: a max-heap over the 1st (lower) half of the array and a min-heap over the 2nd (upper) half, giving us the max of the lower half and the min of the upper half. Averaging these values gives the median of the array. Continuous Median Handler
  2. Merge Sorted Arrays - a great usage of a Min-Heap -> use the min-heap as a buffer for exchanging and comparing the smallest numbers from each subarray. It improves the speed of finding the min value among the subarrays from K to log(K), where K is the number of subarrays (see the sketch below).
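
A sketch of merging K sorted arrays with Python's standard heapq module (illustrative; the repo links its own heap-based Merge Sorted Arrays solution):

import heapq

def merge_sorted_arrays(arrays):
    # the heap holds one (value, array index, element index) entry per subarray,
    # so finding the current minimum costs O(log K) instead of O(K)
    heap = [(arr[0], i, 0) for i, arr in enumerate(arrays) if arr]
    heapq.heapify(heap)
    merged = []
    while heap:
        value, arr_idx, elem_idx = heapq.heappop(heap)
        merged.append(value)
        if elem_idx + 1 < len(arrays[arr_idx]):
            heapq.heappush(heap, (arrays[arr_idx][elem_idx + 1], arr_idx, elem_idx + 1))
    return merged


print(merge_sorted_arrays([[1, 4, 7], [2, 5], [3, 6, 8]]))  # [1, 2, 3, 4, 5, 6, 7, 8]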

AVL Tree

Implementation

AVL tree is a self-balancing binary search tree in which each node maintains extra information called a balance factor whose value is either -1, 0 or +1. Balance factor of a node in an AVL tree is the difference between the height of the left subtree and that of the right subtree of that node. Balance Factor = (Height of Left Subtree - Height of Right Subtree) or (Height of Right Subtree - Height of Left Subtree) Picture

Complexity

Insertion: O(log n) / Deletion: O(log n) / Search: O(log n)

AVL Tree Applications

  • For indexing large records in databases
  • For searching in large databases

B-tree

B-tree is a special type of self-balancing search tree in which each node can contain more than one key and can have more than two children.

Picture

Complexity

  • Insertion O(log n) Time / O(n) Space
  • Deletion O(log n) best / O(n) avg Time / Space
  • Search O(log n) Time / Space

Usage

The need for the B-tree arose from the need to reduce the time spent accessing physical storage media such as hard disks. Secondary storage devices are slower but have a larger capacity, so data structures that minimize disk accesses were needed.

Other data structures such as a binary search tree, avl tree, red-black tree, etc can store only one key in one node. If you have to store a large number of keys, then the height of such trees becomes very large and the access time increases.

However, B-tree can store many keys in a single node and can have multiple child nodes. This decreases the height significantly allowing faster disk accesses.

B-tree Properties

  • All leaf nodes have the same depth.
  • An internal node has at most (n-1) keys.
  • All nodes have at most n children and at least n/2 children.
  • There is an additional bit of info per node, which is "True" if the node is a leaf and "False" otherwise.
  • Height h <= log_t((n+1)/2), where t is the minimum number of children per node.
  • The root node has at least 2 children and at least 1 key.

B Tree Applications

  • databases and file systems
  • to store blocks of data (secondary storage media)
  • multilevel indexing

B+ tree

The B+ tree is an advanced B-tree which is faster and easier to insert into, search and delete from. All values are contained in the leaf nodes, which are indexed at multiple levels. Within a node there is a linear traversal through the keys to find out where to move next; we can optimise that search with binary search.

Picture

Complexity

  • Insertion O(t * log n) Time / O(n) Space
  • Deletion O(t * log n) best / O(n) avg Time & Space
  • Search O(t * log_t(n)) Time / O(log(n)) Space, OR if binary search is applied => O(log(t) * log_t(n)) Time / O(log(n)) Space, where t is the number of keys per node (a linear scan within a node costs O(t), a binary search costs O(log(t))).

B+ tree Properties

  • All leaves are at the same level.
  • All leaves are connected.
  • The root node has at least 2 children.
  • A node has at most m children and at least m/2.
  • A node has at most (m-1) keys and at least m/2 - 1.

B+ Tree Applications

  • Multilevel Indexing
  • Faster operations on the tree (insertion, deletion, search)
  • Database indexing

Red-Black Tree

Implementation

Another type of self-balancing tree, which is a modified version of a normal BST with 1 additional bit of info - a color: RED or BLACK. Difference with the AVL tree: a Red-Black tree is less strictly balanced than an AVL tree, but it requires less time for re-balancing. The RB tree is great for write-heavy data storage with a lot of inserting/deleting, while the AVL tree is great when you have a large dataset and mainly perform searches.

Picture

Complexity

Insertion: O(log n) / Deletion: O(log n) / Search: O(log n)

Red-Black Tree Properties

  • All nodes are either RED or BLACK.
  • The root node is always BLACK.
  • Leaves (NIL nodes) are always BLACK.
  • The number of BLACK nodes on every path from a node to any of its NIL descendants is the same.
  • A RED node has only BLACK children.

Red-Black Tree Applications

  • To implement finite maps
  • To implement Java packages: java.util.TreeMap and java.util.TreeSet
  • To implement Standard Template Libraries (STL) in C++: multiset, map, multimap
  • In Linux Kernel

Splay Tree

Implementation

This is a self-balancing binary search tree which moves any node that was just inserted/searched up to the root, without breaking the BST properties. If the next lookup request is for the same element, it can be returned immediately.

Complexity

Importantly, splay trees offer amortized O(log(n)) performance; a sequence of M operations on an n-node splay tree takes O(M * log(n)) time.

Usage

A splay tree is an efficient implementation of a balanced binary search tree that takes advantage of locality in the keys used in incoming lookup requests. For many applications, there is excellent key locality. A good example is a network router. A network router receives network packets at a high rate from incoming connections and must quickly decide on which outgoing wire to send each packet, based on the IP address in the packet. The router needs a big table (a map) that can be used to look up an IP address and find out which outgoing connection to use. If an IP address has been used once, it is likely to be used again, perhaps many times. Splay trees can provide good performance in this situation.

Splay Tree Applications:

  • Network router
  • Recent look-up

Tries

A trie is a tree data structure that stores a character at each node. A trie has one root node and multiple child nodes, and may have end nodes, which can be represented by a special symbol like * to indicate complete words.

Picture

There are:

  • Prefix Trie (build the trie from the beginning of the string)
  • Suffix Trie (build Trie from the end of the string)
	root->		babc
			abc
			bc
			c

Find its implementation here Tries

Tries Algorithms / Problems / Usage

A trie is often used to store a vocabulary in one place with easy access and searching in linear time. One classic problem is to find out whether given words appear in some text.
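
A minimal prefix trie sketch with insert and contains (illustrative; the repo's Tries implementation linked above is more complete):

class Trie:
    def __init__(self):
        self.root = {}
        self.end_symbol = "*"

    def insert(self, word):
        node = self.root
        for letter in word:
            node = node.setdefault(letter, {})
        node[self.end_symbol] = True   # marks a complete word

    def contains(self, word):
        node = self.root
        for letter in word:
            if letter not in node:
                return False
            node = node[letter]
        return self.end_symbol in node


trie = Trie()
trie.insert("power")
print(trie.contains("power"), trie.contains("pow"))  # True False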

Graphs

A graph is a collection of nodes connected by edges.

Picture

There are 2 types of graph representation:

  • Adjacency List (common representation)
  • Adjacency Matrix NxN

Picture

Graphs Search

The most common ways of searching in graphs are:

  • depth first search (DFS) - explore each branch fully before switching to another branch
  • breadth first search (BFS) - explore neighbours before their children

BFS is great for finding the shortest path, while DFS is easier to implement. DFS is implemented recursively, and BFS uses a queue. Note that in graphs we have to track which nodes we have already visited. In some problems, we might need to distinguish nodes that were visited before from nodes that are being visited right now (see the sketch below).

  • bidirectional search - run BFS from the 2 source nodes and find the node where the two searches collide. Bidirectional search is faster than normal BFS because we don't traverse all the nodes around our target nodes. Mathematically, the time complexity of traditional BFS is O(k^d), where k is the number of neighbours per node and d is the length of the shortest path between the two target nodes, whereas bidirectional search takes O(k^(d/2)) time.
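
A minimal BFS sketch over an adjacency list with visited tracking (illustrative; the repo's graph implementation below wraps the same idea):

from collections import deque

def bfs(graph, start):
    # graph: adjacency list, e.g. {"A": ["B", "C"], "B": ["D"], ...}
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbour in graph.get(node, []):
            if neighbour not in visited:   # track visited nodes to avoid revisiting
                visited.add(neighbour)
                queue.append(neighbour)
    return order


graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(bfs(graph, "A"))  # ['A', 'B', 'C', 'D']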

Simple Graph implementation

Methods:

  • create graph from vertices and edges
  • add nodes/dependencies to graph
  • depth first search (DFS) from the start node
  • breadth first search (BFS) from the start node

Algorithms / Usage / Problems

Graphs are widely used in problems where different nodes are combined into a system of dependencies (e.g. social network relations). It can also be useful to explore how nodes relate to each other in order to build a sequence for completing certain tasks. For instance, Topological Sort solves the problem where we have to figure out in which order to complete jobs, assuming that some jobs can only be completed after others.

Another usage can be found in matrices. You can think of them as map problems - find a path from one point to another, or explore a path and determine whether it is the one you are looking for.

Fancy Graphs

Techniques

Bit Manipulation

Bit Manipulation Problems

Operators:

^ - XOR

~ - NOT

Tips

Note that a ^ (~a) => 1111... (which is -1) and a ^ a => 0.

Negative representation (two's complement with N bits): -k => concat(1, 2^(N-1) - k)

Example: -3 in 4 bits => 8 - 3 = 5 => 101; concat(1, 101) => 1101

Picture

Try to use the base-2 structure of numbers to solve your problems: use % 2 and // 2 and process each bit immediately instead of saving the bits into an array (see the sketch below).
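
For example, counting set bits by processing each bit immediately with % 2 and // 2 (a minimal sketch):

def count_set_bits(number):
    # peel off the lowest bit each iteration instead of
    # building a list of bits first
    count = 0
    while number > 0:
        count += number % 2   # current lowest bit
        number //= 2          # drop it
    return count


print(count_set_bits(13))  # 0b1101 -> 3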

Right shifts

Arithmetic right shift: a >> 1 shifts by 1 bit and fills the vacated position with the sign bit, so the result is division by 2, e.g. -75 >> 1 => -38. Logical right shift: a >>> 1 (in Java-like languages; Python's >> is always arithmetic) shifts by 1 bit and fills the leftmost bit with 0, so for an 8-bit value -75 >>> 1 => 90.

Recursion

The best way to understand whether a problem f(n) can be solved with recursion is to try to solve f(n-1) / f(1) and then f(n-2) / f(2) accordingly. Can we solve the problem by solving its subproblems?

Regarding the estimation of big-O time complexity: if the recursion splits the function into X subproblems at each step, then you might have O(X^N) time. Draw the recursion tree to estimate it.

Dynamic Programming

Dynamic Programming Problems

Dynamic Programming is a method of solving tasks where you solve subtasks of the main task, store the results in some array/data structure (caching), and reuse them when needed (e.g. in the next recursive call).

A few main techniques:

  • Store max/min/special values in an array/matrix of the same size and use them for the next steps/calculations.
  • Store indexes to track sequences -> e.g. what happens if the element at that index is included in the final sequence?

Tips

When you have a task to compare 2 strings/arrays, to find special subsequences and so on, try to build a 2D matrix and solve it index by index. Find the pattern and return the last element of the matrix, or backtrack through the matrix to build a certain sequence.

Examples:

  • Min Number of Exchange - you need to make up an amount n out of coin denominations. Start solving from amount 0 up to n. Try to define a pattern. Focus on the denominations and their values: how do they impact the result?
  • Min Jumps to Reach the End - it is okay to solve with DP (the standard technique), but there is a smarter way. Count how many steps you have left from the current number, and track the max position you can reach using the next jump plus these current steps. Once you run out of steps, update them from the max reachable position minus your current position and do jumps += 1. In general, when the result depends on 2 inputs -> build a matrix.
  • Min Number of Edit / Levenshtein Distance (2 strings) - build a matrix with one row per character of one string and one column per character of the other. Let matrix[i][j] be the number of edits needed to match the first i and j characters. Then matrix[i-1][j] corresponds to deleting a letter from string 1, matrix[i][j-1] to deleting a letter from string 2, and matrix[i-1][j-1] to replacing a letter (see the sketch after this list).
  • Knapsack Problem - find the max possible value of items which you can put into a bag with a certain capacity (2 inputs: value and weight of items + bag capacity) -> build a matrix of capacities x items and track the best value for each capacity at each step. Then backtrack to recover your items.
  • When a problem requires calculating something to the left and to the right of an index in an array, you can calculate the left solution by traversing from left to right, and then the right solution by traversing from right to left. Then combine these 2 results into one final cell. Water Area
  • Max Profit with K Operation - start from the idea that at first you have 0 on your balance, then on some day you buy a stock (track the min sum spent to buy stocks), and then calculate the max profit at each next index.
  • Min String Cut to have palindromes - build a matrix which records, for each substring (pair of indexes), whether it is a palindrome. Then, with a new array, check how you can cut the string into palindromes. If the prefix ending at a particular index is itself a palindrome, then you have 0 cuts at that index.
  • Longest Increasing Subsequence - use an additional array to track the previous index in each sequence. There are 2 methods: 1) a for loop with a nested for loop to identify the max sequence from previous solutions; 2) build an array where you store temporary values and try to insert each new number into it -> if it can be inserted, replace the number at that index with the new number and record which idx/num came before it in the sequence. Use binary search for a log(N) search of the replacement index.
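
A minimal sketch of the Levenshtein distance matrix described above (illustrative only; not necessarily identical to the repo's Min Number Of Edits solution):

def levenshtein_distance(s1, s2):
    rows, cols = len(s1) + 1, len(s2) + 1
    # matrix[i][j] = edits to turn s1[:i] into s2[:j]
    matrix = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        matrix[i][0] = i          # delete all i characters
    for j in range(cols):
        matrix[0][j] = j          # insert all j characters
    for i in range(1, rows):
        for j in range(1, cols):
            if s1[i - 1] == s2[j - 1]:
                matrix[i][j] = matrix[i - 1][j - 1]          # letters match, no edit
            else:
                matrix[i][j] = 1 + min(matrix[i - 1][j],     # delete
                                       matrix[i][j - 1],     # insert
                                       matrix[i - 1][j - 1]) # replace
    return matrix[-1][-1]


print(levenshtein_distance("kitten", "sitting"))  # 3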

Math and Logic

In many tasks where you need to identify an uncommon sample, try to assign a special attribute or a unique number to each sample, so that once you take a measurement you can identify this special sample with some math operation. For instance, giving each sample a binary index can quickly reveal the poisoned one.

Basics

Any number can be expressed as a product of prime numbers: any number = 2^(a1) * 3^(a2) * 5^(a3) * ...

To check whether a number n is prime, note that for any factorization a * b = n one of the factors satisfies a <= sqrt(n), so it is enough to check n % a == 0 for a <= sqrt(n). There is no need to explore b, because it mirrors a.
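
A minimal sketch:

def is_prime(n):
    if n < 2:
        return False
    a = 2
    while a * a <= n:         # only check divisors up to sqrt(n)
        if n % a == 0:
            return False
        a += 1
    return True


print([x for x in range(2, 20) if is_prime(x)])  # [2, 3, 5, 7, 11, 13, 17, 19]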

Probability AND: P(A and B) = P(B given A) * P(A)

Picture

Probability OR: P(A or B) = P(A) + P(B) - P(A and B)

Picture

TBD

Object-Oriented Design

This set of problems is meant to reveal your programming style.

Question: design a coffee maker. Steps to solve it:

  1. Gather more information about the target object: what, when, who, where, how, why? -> 1) is it a single coffee machine for a small group of people making just simple black coffee? 2) is it a massive restaurant service for hundreds of customers per hour?
  2. Define the core objects: like Table, Order, Server, Guest, etc.
  3. Define relationships between classes: which class relies on which other?
  4. Investigate actions: what does each object/class/entity perform in your system? Here you may find new classes which you forgot to include in your design, e.g. Reservation.

Interesting key points:

  • Name mangling is helpful for letting subclasses override methods without breaking intraclass method calls. For example:
class Mapping:
    def __init__(self, iterable):
        self.items_list = []
        self.__update(iterable)

    def update(self, iterable):
        for item in iterable:
            self.items_list.append(item)

    __update = update   # private copy of original update() method

class MappingSubclass(Mapping):

    def update(self, keys, values):
        # provides new signature for update()
        # but does not break __init__()
        for item in zip(keys, values):
            self.items_list.append(item)
  • You can make your class iterable:
class Reverse:
    """Iterator for looping over a sequence backwards."""
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        # delegate to the built-in reversed() iterator
        return reversed(self.data)


rev = Reverse("spam")
for l in rev:  # instead of rev.data
    print(l)
"""
m
a
p
s
"""

Or with next()

class Reverse:
    """Iterator for looping over a sequence backwards."""
    def __init__(self, data):
        self.data = data
        self.index = len(data)

    def __iter__(self):
        return self

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]

Algorithms

Sort

  • Merge Sort -> copy the array and recursively split down to 1-2 elements, sort them, and then merge with the others while returning from the recursion.
  • Heap Sort
  • Quick Sort -> pivot + 2 pointers -> we find the pivot's final position in the array, as if the array were already sorted.
  • Selection Sort
  • Insertion Sort
  • Bubble Sort

Linked Lists Sort

Refer to Singly Linked List and Doubly Linked List to find sort methods out.

Search

  • Binary Search -> find an element in a sorted array (see the sketch below)
  • Quick Select -> find the kth smallest element in the array -> we find the pivot's final position in the array, as if the array were already sorted. There are 3 conditions for left_pointer (l), right_pointer (r) and pivot (p):
  1. if A[l] > A[p] > A[r] => swap(l, r)
  2. if A[l] <= A[p] => l++
  3. if A[r] >= A[p] => r--
  • Search In Sorted Matrix -> find an element in a sorted matrix
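
A minimal binary search sketch (illustrative, not necessarily the repo's implementation):

def binary_search(sorted_array, target):
    left, right = 0, len(sorted_array) - 1
    while left <= right:
        middle = (left + right) // 2
        if sorted_array[middle] == target:
            return middle
        if sorted_array[middle] < target:
            left = middle + 1    # target is in the right half
        else:
            right = middle - 1   # target is in the left half
    return -1


print(binary_search([1, 3, 5, 8, 13], 8))  # 3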

String Search

  • Knuth Morris Pratt -> to define if "string" contains "substring" in O(N + M) time and O(M) space complexity.
  • Multi String Search -> find if strings are in big_string in O(NS + BS) Time and O(NS) Space, where N - number of substrings in small_strings, S - largest len element in small_strings, B - len of big_string.
  • Aho-Corasick Algorithm [TBD]

Transformation

Create DLL from BT: from left to right with flatten_binary_tree()

Invert BT with invert_binary_tree()

  • Input string '[10, 20, 30], 2' => output array=[10, 20, 30], k=2

Traversal

  • in_order_traverse -> add left branch, then current node and then right branch
  • pre_order_traverse -> add node to order before its child nodes
  • post_order_traverse -> add node after its child nodes

Validation

Merge

  • Merge Sorted Arrays in O(N*log(K) + K) Time / O(N + K) Space, where N is the total number of elements and K is the number of subarrays.

Fancy Algorithms

  • Topological Sort - graph - return a correct order of job completion, when each job can only be completed after certain others have been completed.
  • Dijkstra Algorithm - graph - return the shortest paths from a start node to all other nodes, when it takes some time/length/weight (positive) to move from one node to another.
  • A Star Algorithm - graph - return the shortest path from a start point to an end point on a grid (matrix) with obstacles, based on a certain score for each node and the order of exploring these nodes.