Steps to go through to find a solution
- Understand the problem
- Think about constraints, assumptions, and requirements
- Break the problem into subproblems (what algo can solve this kind of problem)
- Come up with simple examples and try to find a pattern in the way it's being solved
Asymptotic complexity
T(n) = aT(n/b) + O(n^d) where:
- a is the number of subproblems,
- b is the factor by which the problem size is reduced,
- d represents the exponent of the additional work done outside the recursive calls. Compare n^(log_b a) vs. n^d (a.k.a. a vs. b^d):
- Case 1: If d < log_b a (i.e., a > b^d), then recursion dominates: T(n) = O(n^(log_b a))
- Case 2: If d > log_b a (i.e., a < b^d), then non-recursive work dominates: T(n) = O(n^d)
- Case 3: If d = log_b a (i.e., a = b^d), then they are balanced, and we get an extra logarithmic factor: T(n) = O(n^d log n)
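For example, MergeSort's recurrence T(n) = 2T(n/2) + O(n) has a = 2, b = 2, d = 1; since a = b^d, Case 3 applies and T(n) = O(n log n). Binary search (a = 1, b = 2, d = 0) is also Case 3, giving T(n) = O(log n).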
Problem solving with MergeSort
Count inversions (a, b) with a ∈ A and b ∈ B, assuming A and B are sorted.
- Scan A and B from left to right.
- Compare a_i and b_j.
- If a_i < b_j, then a_i is not inverted with any element left in B.
- If a_i > b_j, then b_j is inverted with every element left in A.
- Append smaller element to sorted list C.
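A minimal Python sketch of this counting scheme, embedded in the usual MergeSort recursion (function names are illustrative):

```python
def sort_and_count(arr):
    """Return (sorted copy of arr, number of inversions in arr)."""
    if len(arr) <= 1:
        return arr, 0
    mid = len(arr) // 2
    left, inv_left = sort_and_count(arr[:mid])
    right, inv_right = sort_and_count(arr[mid:])
    merged, inv_split = merge_and_count(left, right)
    return merged, inv_left + inv_right + inv_split

def merge_and_count(a, b):
    """Merge sorted lists a and b; count pairs (x, y), x in a, y in b, x > y."""
    c, inversions = [], 0
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            c.append(a[i])            # a[i] is not inverted with anything left in b
            i += 1
        else:
            inversions += len(a) - i  # b[j] is inverted with every element left in a
            c.append(b[j])
            j += 1
    c.extend(a[i:])
    c.extend(b[j:])
    return c, inversions
```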
How to give a greedy algo
- Give the heuristic: We state the greedy choice, which is also called the greedy heuristic. A well-designed greedy algorithm is often phrased in just 2–3 sentences (see Lectures).
- Show the algorithm produces a valid output: We show that our greedy heuristic produces a valid solution. If the heuristic is well-phrased, this is often straightforward and often only takes 1–3 sentences.
- Argue that the greedy heuristic produces an optimal solution: We of course need to argue that the algorithm we proposed actually works. An algorithm works if it outputs an optimal solution on every valid input. We thus need to show that the value of the solution produced by our algorithm is no worse than the value of any valid solution. A canonical, clean, and insightful argument for this is typically a proof by contradiction.
- Give the complexity of the algorithm: This is typically straightforward and often only takes 2–3 sentences. Typical greedy algorithms can clearly be implemented in a polynomial number of steps, and often that suffices for our analysis of the complexity.
Greedy justification
- Let v_min be a vertex of smallest degree d_min in G. It is never a bad choice to let v_min be labelled 1.
- The reason is that some vertex in G must be labelled 1, and its outdegree in the directed graph Ĝ will be its degree in G, which is at least d_min. After removing v_min, we have a subgraph of G, and the value of a subgraph of a graph H is never worse than the value of the graph H itself.
Thought process:
- State your greedy heuristic in a strong manner that shows that every decision we take is the best decision we can possibly take.
- Give a clear argument for why each single choice is correct.
- "Our greedy algorithm picks a choice, and that choice leads to an optimal solution, so there is no solution that is better than our choice."
- "The green and purple edges jointly constitute an MST T. The green edge g11 costs at least as much as the red edge g8, since the red edge is the greedy choice, and our greedy algorithm picks a cheapest edge. If we swap edge g11 out for g8, we have a spanning tree T′, and the cost of T′ is no more than the cost of T. So T′ must also be an MST."
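To make the exchange argument concrete, here is a minimal Kruskal-style MST sketch in Python; it illustrates "our greedy algorithm picks a cheapest edge", with the function name and edge representation being our own choices:

```python
def kruskal_mst(n, edges):
    """edges: list of (cost, u, v) with vertices 0..n-1. Returns the MST's edges."""
    parent = list(range(n))

    def find(x):                      # union-find root lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for cost, u, v in sorted(edges):  # greedy choice: always try a cheapest edge
        ru, rv = find(u), find(v)
        if ru != rv:                  # adding the edge creates no cycle
            parent[ru] = rv
            mst.append((cost, u, v))
    return mst
```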
C. Dynamic Programming (DP)
How to Formulate a DP Solution:
- Identify Overlapping Subproblems:
⢠If a recursive solution recomputes the same values, DP is likely applicable.
- Define the State:
⢠Determine what each DP table entry (or memoized function) represents. Example: For knapsack, dp[i][w] might represent the maximum value achievable with the first i items and a capacity w.
- Develop the Recurrence:
⢠Express the solution to a problem in terms of solutions to smaller subproblems.
- Determine Base Cases:
⢠Clearly define the simplest instances (e.g., dp[0][*] = 0).
- Choose an Implementation Strategy:
⢠Top-Down (Memoization): Recursively compute values and cache results. ⢠Bottom-Up (Tabulation): Iteratively fill in a DP table.
- Reconstruct the Solution (if necessary):
⢠Use auxiliary pointers or backtracking through your DP table.
Common DP Problems & Approaches:
⢠Knapsack Problem: State: dp[i][w] = maximum value using items 1 to i with capacity w. Recurrence: dp[i][w] = max(dp[i-1][w], value[i] + dp[i-1][w-weight[i]]).
⢠Longest Common Subsequence (LCS): State: dp[i][j] = length of LCS for prefixes X[1..i] and Y[1..j]. Recurrence: If X[i] == Y[j] then dp[i][j] = 1 + dp[i-1][j-1] else dp[i][j] = max(dp[i-1][j], dp[i][j-1]).
⢠Longest Increasing Subsequence (LIS): State: dp[i] = length of the longest increasing subsequence ending at index i. Recurrence: dp[i] = 1 + max{ dp[j] : j < i and array[j] < array[i] }.
LSS
In many divide-and-conquer (or segment tree) solutions to the Largest Subarray Sum problem, the recursive function returns a tuple of four values, here labeled best, prefix, suffix, and total, each of which encapsulates important information about the subarray. Here's what each one typically represents:
• best: The Maximum Subarray Sum
This is the best (largest) sum obtainable by any contiguous subarray within the current segment.
• prefix: The Maximum Prefix Sum
This is the largest sum of a subarray that starts at the beginning of the segment. In other words, it's the best sum you can get by taking a prefix of the current segment.
• suffix: The Maximum Suffix Sum
This is the largest sum of a subarray that ends at the end of the segment. It represents the best sum obtainable by taking a suffix of the segment.
• total: The Total Sum of the Segment
This is simply the sum of all the elements in the current segment.
Why These Four Values?
When you split the array into two halves, these four values allow you to efficiently merge the solutions from the left and right segments. For instance, if you have two segments with tuples (best_L, prefix_L, suffix_L, total_L) and (best_R, prefix_R, suffix_R, total_R), you can combine them as follows:
• Total Sum: total = total_L + total_R
• Maximum Prefix Sum: prefix = max(prefix_L, total_L + prefix_R)
• Maximum Suffix Sum: suffix = max(suffix_R, suffix_L + total_R)
• Maximum Subarray Sum: best = max(best_L, best_R, suffix_L + prefix_R)
This merging ensures that every possible subarray (entirely in the left segment, entirely in the right segment, or crossing the midpoint) is considered.
In Summary
⢠is the overall best (largest) subarray sum in the segment.
⢠is the best sum from the start of the segment to some position.
⢠is the best sum from some position to the end of the segment.
⢠is the total sum of the segment.
These values are crucial for building an efficient recursive (or segment tree) solution to the Maximum Subarray Sum problem.
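A minimal divide-and-conquer sketch in Python using exactly these four values and merge rules (the names best, prefix, suffix, total follow the labels used above):

```python
def lss(arr, lo=0, hi=None):
    """Return (best, prefix, suffix, total) for arr[lo:hi]; arr must be non-empty."""
    if hi is None:
        hi = len(arr)
    if hi - lo == 1:
        x = arr[lo]
        return x, x, x, x              # a single element is its own best/prefix/suffix/total
    mid = (lo + hi) // 2
    bL, pL, sL, tL = lss(arr, lo, mid)
    bR, pR, sR, tR = lss(arr, mid, hi)
    total = tL + tR
    prefix = max(pL, tL + pR)          # prefix stays left or crosses into the right half
    suffix = max(sR, sL + tR)          # suffix stays right or crosses into the left half
    best = max(bL, bR, sL + pR)        # best is left-only, right-only, or crossing
    return best, prefix, suffix, total

# Usage: lss([-2, 1, -3, 4, -1, 2, 1, -5, 4])[0] == 6  (subarray [4, -1, 2, 1])
```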
Below is a concise, high-level explanation of how one might reason about the problem, so you can see the main ideas that lead to the solution.
1. Realizing There Can Be Only a Few Frequent Elements
- Key Insight: At most 6 frequent elements
An integer x is called frequent if it appears at least ⌈n/6⌉ times in the sorted array A[1..n]. Observe that you cannot have more than 6 distinct frequent elements. Indeed, if you had 7 different elements each appearing ⌈n/6⌉ times, you would exceed n. Thus, at most 6 elements can qualify.
- Why sampling works
Because A is sorted, any frequent element x forms a contiguous block of at least ⌈n/6⌉ identical values. Such a block must "cross" one of a small number of carefully chosen positions.
2. Exploiting the Sorted Structure
- Why Sorting Helps: In a sorted array, all occurrences of a given value appear in one contiguous block. This means if an element appears ⌈n/6⌉ times, that block of identical values must be long enough to "cross" certain key positions in the array.
3. Pinpointing Candidates via Sample Positions
- Sampling Idea: Divide the array into 6 (approximately) equal parts. If an element appears in at least ⌈n/6⌉ positions, then its block of occurrences must overlap one of these 6 dividing points.
- Thus, you only need to look at the values at those dividing points (there are at most 6 such positions) as potential frequent elements.
4. Confirming Candidates with Binary Search
- Check Each Candidate: Once you have at most 6 candidate values, you can do two binary searches for each candidate:
- One to find where the candidate first appears.
- Another to find where the candidate last appears.
- The difference of those two indices (plus one) gives you the candidate's total frequency. If it meets or exceeds ⌈n/6⌉, it is truly frequent.
5. Time Complexity
- Sampling those 6 positions takes constant time.
- Checking each candidate with two binary searches takes O(log n) time per candidate, for a total of O(log n) across all candidates.
- Hence, the algorithm is both correct (it finds all frequent elements and only those) and efficient (runs in O(log n) time).
Why This Works
- No Missed Elements: A frequent element must appear at one of the 6 sampled positions, so you do not miss any.
- No Extra Elements: You verify each candidate by counting, so only truly frequent ones are reported.
This approach leverages the interplay between the frequency threshold (⌈n/6⌉) and the sorted nature of the array. It systematically narrows down the possibilities to at most 6 checks, each confirmed via fast lookups (binary search).
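A minimal Python sketch of the whole algorithm under the above assumptions (sorted input, threshold ⌈n/6⌉), using bisect for the two binary searches:

```python
import bisect
import math

def frequent_elements(A):
    """Return all values appearing >= ceil(n/6) times in the sorted list A."""
    n = len(A)
    if n == 0:
        return []
    k = math.ceil(n / 6)                        # frequency threshold
    # Sample positions spaced k apart (at most 6 of them): any block of
    # >= k equal values must cover at least one sampled index.
    candidates = {A[i] for i in range(k - 1, n, k)}
    result = []
    for x in candidates:
        first = bisect.bisect_left(A, x)        # index of first occurrence
        last = bisect.bisect_right(A, x) - 1    # index of last occurrence
        if last - first + 1 >= k:               # verify the candidate's count
            result.append(x)
    return result

# Usage: frequent_elements([1, 1, 1, 2, 3, 4, 5, 5, 5, 5, 6, 7]) -> [1, 5]
```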