Sunday, 4 April 2021

Write Ahead Logging

In applications that needs to run continuously, ability to recover is key when a component goes down. If the component is transactional in nature we need to ensure that in the event of crash, system stays consistent with operations done before the crash.

To ensure this we need to maintain state of the component in a store that is reliable and persists accross crash. While there are multiple ways to persist the state, a naive could be to take a snapshot of the entire data every time we do an update. This can be a expensive operation if updates are frequent and state is too big. But this will be useful when updates are periodic and less frequent and update path doesn't block the read path (mostly by maintaining a version of the data all the time to be served). Other way of persisting the entire state is to maintain a log of all the updates happened from the begining thus on update we just have to log the update command. This greatly reduces the amount of time taken to persist the update but recovery can take a lot of time. This process of Write-Ahead-Logging helps when updates are frequent enough that you can't maintain a different version of whole data each time.

Some drawbacks of logging is that logs can grow indefinitely unlike snapshots hence we might need to do periodic snapshoting to maintain a low watermark beyond which data is stored in a snapshot and rest of updates needs to replayed from the log. This will also ease the load recovery time as most of the data is recovered from the alsnapshot and only recent logs are replayed.

Write-Ahead-Logging can be used for wide variety of applications, from a simple key value store to a huge database. 

Saturday, 17 October 2020

Diameter of a tree

 Problem: Given edges for a tree find the diameter of the tree or the longest path in the tree.

Sample Input: 

4

5 1

1 3

1 2

2 4


Representation of tree: Tree can be represented in multiple ways, 

1. If we simplify the problem and say it is a binary tree we can represent it as array where a[i]-> left is a[2*i] and a[i]->right is a[2*i+1]. 

2. One other way to represent it is by using parent array: parrent[i] = j if there is an edge j->i but the tree needs to be directed in this case

3. We can have a struct/class Node which has pointers to left and right

Node { 

    int val, 

    Node* left, 

    Node* right

}


To calculate the diameter of a tree we have to pick a root and do dfs/bfs to find the left node with maximum distance and find the farthest node from it.

Tuesday, 25 August 2020

Sorting and Searching [CSES] [Day-2]

Movie Festival

Given starting and endings time of movies, return maximum number of movies we can watch. Could come up with recurcive solution where f(1...n) = max ( 1 + f(k...n), f(2..n)) which can be memoized and dp can be used to reduce the time taken. But it seems there is a greedy approach where we sort by ending time and watch it in that order. It seem to give the maximum number of movies. Need to prove it. Two action items pending here - 
    1. Implement DP approach. 
    2. Prove this greedy approach

.

Sunday, 23 August 2020

Sorting and Searching [CSES] [DAY-1]

Distinct Number

Given n integers find number of disctinct integers among them. Constraint : 'n <=    2 * 10^5'. Tried by maintaining a unorder_set first but that times out when there are 2 * 10^5 unique integers. But storing it in a vector and sorting all at time seem to work fine. Good to find what is the complexity when we use unordered_set and why is it slow?

Ferris Wheel

Given n integers divide them into groups of 2 or 1 so that sum of integers in the group doesn't exceed x. We just have to sort the array and maintain two pointers going from front and back. Intersting thing is while initializing a vector when the reserve function is used sort function doesn't work? but when resize is use it works.

Concert Ticket

Given n tickets and m customer. Each ticket has a price and customers have a certain maximum value they can spend on the ticket. If customers come one by one, what will be the price for each customer?  To solve the first tried to sort it and do binary search for each customer. But since multiple tickets can have same price and we can't reuse a ticket binary search can be linear. So we use this new concept called Coordinate Compression which means we can reduce the size of sorted vector if duplicate values can be eliminated.


Wednesday, 19 August 2020

Introductory Problems [CSES]

Weird Algorithm

This is a simulation-based problem, we have to multiply the number with 3 and add one if it is odd or divide it by 2 if it is even. Do it till we reach 1.

Missing Numbers

We will be given a number n and n-1 number from 1 to n. We have to find the missing number. Maintained an arr of length n and marked the entry 1 if it appeared. 
Space complexity: O(n)

Create Strings 1:

Given a string, print all different combinations of it in a lexicographically sorted way. Used recursion. GetPermutations takes freq_array and size of expected string and returns the list of permutations possible. For each possible first letter, it appends it with strings returned by GetPermutaions( freq_array with current char removed, size - 1).  Time Complexity: O(2^n)

Apple Division

Given a bunch of apples and their weights, divide them into two groups such that the difference in weights is minimum. The approach here is to find all possible combinations and get the combination with the minimum absolute difference with total total_sum/2. Using the above algorithm to find the total sum for all combinations. This approach led to Timeout. Apparently, a better approach to code with is by using "Looping with BitMask". 
 
  for (int i = 0; i < 1 << n; i++) {
  	// new combination
	for (int j = 0; j <= n; j++) {
    	    if (1 << j & i) {
        	// included in current combination
            }
        }
  }
In this case, we have to find out the sum of each combination so we just have to maintain a sum var in outer for loop and in the inner loop include each element in sum if the IF condition is satisfied.


 

Write Ahead Logging

In applications that needs to run continuously, ability to recover is key when a component goes down. If the component is transactional in n...