# Data Structure and Algorithm: An Introduction to the Greedy Algorithm

### This tutorial outlines and explains the Greedy Algorithm, complete with codes and infographics that are easy to follow along. You'll be a pro in no time!

· Java Zone · Tutorial
Save
6.72K Views

## 1. Prefix Tree

### 1.1 Description

The prefix tree is related to the greedy algorithm; let's not talk about the relationship first.

Prefix tree is also called Trie, word search tree, etc. It is a tree structure used to store a large number of strings.

Its characteristic is that space is changed for time, and the common prefix of the string is used to reduce the overhead of query time in order to achieve the purpose of improving efficiency.

### 1.2 Classic Prefix Tree

The characters of the classic prefix tree are stored on the path, and each node has no data.

### 1.3 Code Definition

#### 1.3.1 Node Structure

In order to implement certain functions with Trie, some data items are actually added to the node.

``````public class Node {
// 构建前缀树时,该节点被到达过多少次
int pass;
// 该节点是否是某个字符串的结尾节点,如果是,是多少个字符串的结尾节点
int end;
// 各个字符的通路
Node[] nexts; // HashMap<Char, Node> -- TreeMap<Char, Node>

public Node() {
this.pass = 0;
this.end = 0;
// 假如词典中只有'a'-'z',那么next,对应下标0-25
// 通过下级是否有节点判断是否有路
// nexts==null表示下级没有存储'a'的路
// nexts!=null表示下级有存储'a'的路
nexts = new Node;
}
}
``````

Note: In the prefix tree, the downward path of each node is realized by mounting subordinate nodes. In the implementation of the code, the subscripts of the array are used to number the subordinate nodes that can be mounted, so that the one-to-one correspondence between subscripts and characters can be used to realize that each side "carries" different character information.

If you need to store a string containing many types of characters, it is not appropriate to use an array to store the mounted node. For example, Java supports more than 60,000 characters, so you can't open an array with a capacity of 60,000 from the beginning. Therefore, when there are many types of characters, the array can be replaced by a hash table to store the mounted node, and the key of the hash table can also be a one-to-one correspondence with the character.

After replacing the hash table with the array, the algorithm as a whole will not change, and the details of Coding will change.

However, after using hash table storage, the paths are disordered. If you want to make the paths organized like array storage, you can use an ordered table instead of hash table storage.

Use the node to which the data item has been added to store the above string again:

By adding nodes with data items, we can easily solve many problems, such as:

How do I know if the string "bck" is stored in Trie?

Answer: Starting from the root node, check if there is a way to'b'? Yes; is there a way to'c' then? Yes; is there another way to'k'? Yes; then check the end of the node at the end of the'k' path, if end != 0, then "bck" is stored, if end = 0, no "bck" is stored.

How do I know how many strings are prefixed with "ab" among all the strings stored in Trie?

Answer: Starting from the root node, check if there is a way to'a'? Yes; is there a way to'b' then? Yes; then check the pass of the node at the end of the'b' path. The value of pass is the number of strings prefixed with "ab".

How do you know how many strings are stored in Trie?

Answer: Just check the pass of the root node.

How do you know how many empty strings are stored in Trie?

Answer: Just check the end of the root node.

Through the above problems, we can find that using the data item information of the node type, we can easily query each string stored in the Trie, and the cost of querying the string is very low, only need to traverse the number of characters in the string to be queried The number of times is sufficient.

#### 1.3.2 Tree Structure

``````public class Trie {
// 根节点
private Node root;

public Trie() {
this.root = new Node();
}

// 相关操作
...
}

### 1.4 Basic Operation

Idea: Starting from the root node, the pass of the nodes along the way increases by 1, and the end of the node at the end of the string increases by 1.

Code:

``````public void insert(String word) {
if (word == null) {
return ;
}
char[] charSequence = word.toCharArray();
// 字符串起始节点为根节点
Node cur = this.root;
cur.pass ++;
for (int i = 0; i < charSequence.length; i ++) {
int index = charSequence[i] - 'a';
// 该字符对应节点如果没有挂载就挂载上
if (cur.nexts[index] == null) {
cur.nexts[index] = new Node();
}
// 向下遍历
cur = cur.nexts[index];
cur.pass ++;
}
// 记录字符串结尾节点
cur.end ++;
}

Note: Adding every string must start from the root, which means that every string is prefixed with an empty string.

#### 1.4.2 Delete String

Idea: If the string exists, starting from the root node, the pass of the nodes along the way is decremented by 1, and the end of the node at the end of the string is decremented by 1.

Code:

``````public void delete(String word) {
// 判断是否存储该单词
if (word == null || search(word) == 0) {
return ;
}
char[] charSequence = word.toCharArray();
Node cur = this.root;
cur.pass --;
for (int i = 0; i < charSequence.length; i ++) {
int index = charSequence[i] - 'a';
// 当前节点的下级节点在更新数据后pass为0,意味这没有没有任何一个字符串还通过该节点
if (-- cur.nexts[index].pass == 0) {
// 释放掉下级路径的所有节点
cur.nexts[index] = null;
return ;
}
cur = cur.nexts[index];
}
cur.end --;
}

Note: If only one target string is stored in the Trie, after modifying the node data, you need to release all the redundant nodes. Because of the automatic garbage collection in Java, when the pass of a node is 0 when we traverse for the first time, we can directly set it to null, and then all the subordinate nodes of the node will be automatically recycled. If it is implemented in C++, then it is necessary to traverse to the end, and when backtracking along the way, call the destructor to manually release the node.

#### 1.4.3 Query String

Idea: If the string exists, query the end of the node at the end of the string.

Code:

``````// 查询word这个单词被存储的次数
public int search(String word) {
if (word == null) {
return 0;
}
char[] charSequence = word.toCharArray();
Node cur = this.root;
// 只遍历Trie,不更新数据
for (int i = 0; i < charSequence.length; i ++) {
int index = charSequence[i] - 'a';
// 如果没有挂载节点,说明没有存该字符串
if (cur.nexts[index] == null) {
return 0;
}
cur = cur.nexts[index];
}
// 返回字符串末尾节点的end
return cur.end;
}

#### 1.4.4 Query Prefix

Idea: If the string exists, query the pass of the node at the end of the prefix string.

Code:

``````// Trie存储的字符串中,前缀是pre的字符串个数
public int preFixNumber(String pre) {
if (pre == null) {
return 0;
}
char[] charSequence = pre.toCharArray();
Node cur = this.root;
// 只遍历Trie,不更新数据
for (int i = 0; i < charSequence.length; i ++) {
int index = charSequence[i] - 'a';
// 如果没有挂载节点,说明没有字符串以该子串作为前缀
if (cur.nexts[index] == null) {
return 0;
}
cur = cur.nexts[index];
}
// 返回pre子串末尾节点的pass
return cur.pass;
}

## 2. The Greedy Algorithm

### 2.1 Concept

Under a certain standard, the algorithm that gives priority to the samples that most meet the criteria, and finally considers the samples that do not meet the criteria and finally gets an answer is called the greedy algorithm.

In other words, without considering the overall optimality, what is made is a locally optimal solution in a certain sense.

### 2.2 Description

The greedy algorithm is actually the most commonly used algorithm, and the code implementation is also very short.

Many greedy algorithms only require finding a good solution, not an optimal solution. In other words, for most of the daily greedy algorithms, the process from the local optimum to the overall optimum cannot be proved, or the proof is wrong, because sometimes greedy is very subjective. But the greedy algorithm topics we come into contact with are all deterministic and can find the global optimal solution. At this time, the investigation of the greedy algorithm requires proof from the local optimal to the overall optimal.

This article will not show the process of proof, because for each question, how its local optimal strategy derives the proof of the global optimal is different. If every greedy algorithm problem is proved, the time in the interview process is not enough. The following will introduce a very useful technique, the premise of this technique is to prepare a lot of templates, but only need to prepare once. After preparation, when doing greedy algorithm questions in the future, the answer will be quick and accurate, much faster than the proof.

### 2.3 Meeting Issues

Title:

Some projects need to occupy a conference room for presentations, and the conference room cannot accommodate two presentations at the same time. Give you all the projects and the start time and end time of each project, you will arrange the schedule of the presentation, and the conference room is required to have the most presentations. Return to this most preaching session.

Analysis:

Greedy strategy A: The earlier the project starts, the earlier it will be arranged.

The global optimal solution cannot be obtained. The counter-example is as follows:

Greedy strategy B: The shorter the duration of the project, the first arrangement.

The global optimal solution cannot be obtained. The counter-example is as follows:

Greedy strategy C: Projects that end earlier are scheduled first.

The global optimal solution can be obtained.

Counter-examples for Strategy A:

Counter-examples for Strategy B:

Code:

``````public class Solution {

static class Program {
// 项目开始时间
public int begin;
// 项目结束时间
public int end;
public Program(int begin, int end) {
this.begin = begin;
this.end = end;
}
}

// 定义Program比较器,按照Program的结束时间来比较
static class ProgramComparator implements Comparator<Program> {
@Override
public int compare(Program program1, Program program2) {
return program1.end - program2.end;
}
}

public int arrangeProgram(Program[] programs, int currentTime) {
if (programs == null || programs.length == 0) {
return 0;
}
// 可以安排项目的场数
int number = 0;
// 将Program数组按照结束时间早晚来排序,结束早的排前面
Arrays.sort(programs, new ProgramComparator());
for (int i = 0; i < programs.length; i ++) {
// 如果当前时间还没到会议开始时间,安排会议
if (currentTime < programs[i].begin) {
number ++;
}
// 当前时间来到会议结束
currentTime = programs[i].end;
}
return number;
}
}

### 2.4 Problem-Solving Routines

After reading 2.3, there will definitely be questions. Why can the greedy strategy C find the optimal solution? Don't worry about why the greedy strategy C is right, that is, relying on blindness and skillful blindness.

In questions related to greedy algorithms, you will always have several greedy strategies. If you can use counterexamples to overthrow a few greedy strategies, it would be good. But if there are a few greedy strategies that you feel are more reliable, but you can't cite counterexamples, do you need to use strict mathematical proofs? You can do the questions in private, but never during the interview!

The mathematical proof of the greedy strategy is different for each question, because each question has its own business, the method of proof is also incredible, and it tests mathematical ability.

Then, how can you verify that your greedy strategy is correct? Logarithm.

Problem-solving routines:

1. To achieve a solution X that does not rely on greedy strategies, you can use the most violent attempt.
2. The brain makes up for greedy strategy A, greedy strategy B, greedy strategy C...
3. Use the solution X and the logarithm to verify each greedy strategy, and use an experiment to know which greedy strategy is correct.
4. Don't worry about the proof of greedy strategy.

Prepare:

1. Be prepared for brute force attempts or a full array of code templates.
2. Prepare logarithm code template.

### 2.5 Interview Status

Greedy algorithm questions are inflexible, and the proportion of interviews is relatively low. There are about five questions at most one question.

First, the problem of the greedy algorithm cannot test Coding's skills, because the code is extremely simple.

Second, there is no discriminative degree for the problem of the greedy algorithm. As long as the greedy strategy is found right, it will be the same, only the difference between 0 and 1.

## 3. Logarithm

### 3.1 Description

The logarithm is very easy to use and can help you solve many problems.

Assume that method A is to be tested, but the same problem can be implemented with many strategies. If you don’t consider the complexity of the event, I can write a solution that is violently attempted (for example, list all permutations and combinations). We call the method that does not pursue the pros and cons of time complexity but is very thoughtful and easy to write. Method B. Why test method A? Because it may be difficult to think about method A or the time complexity is relatively low.

Before, did every test need to rely on online OJ? If you have to rely on OJ for every topic, do you still have to look for it on OJ when you encounter an unfamiliar topic? If you can't find it, don't you practice it? Secondly, the online test data is what people think, so when he prepares the test case, will it not make your code run wrong, but does not let you pass the situation? You have passed it, but can you guarantee that your code will be correct? Not necessarily, the logarithm method is needed at this time, and the logarithm method is foolproof.

### 3.2 Implementation

Realize a random sample generator, can control the sample size, can control the number of tests, and can generate random data. The generated data is run in method A to get res1, and then run in method B to get res2. Check whether res1 and res2 are consistent. You measure tens of thousands of groups and then change the sample size. When you find that res1 and res2 are inconsistent, then either method A is wrong, or method B is wrong, or both are wrong.

The realization of random numbers in Java:

``````// [0,1)上所有小数，等概率返回一个，double类型
random = Math.random();
// [0,N)上所有小数，等概率返回一个，double类型
random = N * Math.random();
// [0,N-1]上所有整数，等概率返回一个，int类型
random = (int) (N * Math.random());

By generating random numbers, you can make any part of the test sample random, such as: sample size, number of tests, test data.

The business logic of each topic is different, and the implementation of the logarithm is also different. You need to implement it according to the actual situation.

## 4. Interview Questions

### 4.1 Finding the Median

Title:

The user uses a structure to store N numbers one by one, and requires that the median of all numbers currently stored in the structure can be found at any time.

Rule: The median of odd-numbered digits is the middle digit, and the median of even-numbered digits is the average of the middle two digits.

Analysis:

This topic has nothing to do with the greedy algorithm, and is a classic topic for investigating the application of the reactor.

Because the heap is widely used in the greedy algorithm, you can familiarize yourself with the operation of the heap through this topic.

The time complexity of this algorithm is very low because all operations on the heap are O(logN).

The process is:

1. Prepare a large root pile and a small root pile.
2. The first number enters the big root pile.
3. Enter the fixed iterative process:
4. Whether the current input number cur <= the top number of the big root pile, if it is, cur enters the big root pile; if not, cur enters the small root pile.
5. Observe the size of the big root pile and the small root pile. If the larger one has 2 more than the smaller ones, the top of the larger one pops into the other.
6. Stop iteration after all numbers are stored.
7. The smaller N/2 numbers are in the large root pile, and the larger N/2 numbers are in the small root pile. The median can be found by using the top numbers of the two piles.

Code:

``````public class Solution {

// 大根堆比较器
static class BigComparator implements Comparator<Integer> {
@Override
public int compare(Integer num1, Integer num2) {
return num2 - num1;
}
}

public static int findMedian(int[] input) {
if (input == null || input.length == 0) {
return 0;
}
PriorityQueue<Integer> bigRootHeap = new PriorityQueue<>(new BigComparator());
PriorityQueue<Integer> smallRootHeap = new PriorityQueue<>();
// 第一个数先入大根堆
for (int i = 1; i < input.length; i ++) {
if (input[i] <= bigRootHeap.peek()) {
} else {
}
if (Math.abs(bigRootHeap.size() - smallRootHeap.size()) == 2) {
if (bigRootHeap.size() > smallRootHeap.size()) {
} else {
}
}
}
// 判断输入数字个数是奇数还是偶数
if (input.length % 2 != 0) {
return smallRootHeap.peek();
}
return (bigRootHeap.peek() + smallRootHeap.peek()) / 2;
}
}

### 4.2 Gold Bar Problem

Title:

Cutting a gold bar in half requires a copper plate with the same length as the length. For example, a strip with a length of 20 will cost 20 copper plates no matter how long it is cut into two halves.

A group of people wants to divide the whole gold bar, how to divide the most economical copper plate?

For example, given an array [10, 20, 30], representing a total of three people, the length of the whole gold bar is 10 + 20 + 30 = 60. Gold bars are divided into three parts: 10, 20, and 30. If you divide the 60-length gold bar into 10 and 50, it costs 60; then divide the 50-length gold bar into 20 and 30, and it costs 50; it costs 110 copper plates in total.

But if you divide the 60-length gold bar into 30 and 30, it costs 60; then divide the 30-length gold bar into 10 and 20, and it costs 30; a total of 90 copper plates are spent.

Enter an array and return the minimum cost of splitting.

Analysis:

This problem is the classic Huffman coding problem.

The process is:

1. Put all the elements in the array into the small root heap
2. Iterative fixed process:
3. Pop two nodes from the top of the small root pile and combine them into new nodes to construct a Huffman tree.
4. The new node is put back into the small root heap.
5. Stop iterating until there is only one node in the small root heap.

Code:

``````public int leastMoney(int[] parts) {
PriorityQueue<Integer> smallRootHeap = new PriorityQueue<>();
// 需要花费的最少钱数
int money = 0;
// 将节点全部放入小根堆
for (int i = 0; i < parts.length; i ++) {
}
// 直到堆中只有一个节点时停止
while (smallRootHeap.size() != 1) {
// 每次堆顶弹两个算累加和
int cur = smallRootHeap.poll() + smallRootHeap.poll();
money += cur;
// 累加和的新节点入堆
}
return money;
}

### 4.3 Project Planning Issues

Title:

Your team has received some projects, and each project will have cost casts and profits. Since your team has very few people, you can only do projects serially. Given that your team’s current disposable funds are M and can only do K projects at most, what are the final maximum disposable funds?

Note: casts[i], profits[i], M and K are all positive numbers.

Analysis:

The process is:

1. Build a small root pile and a large root pile. The sorting criterion for small root piles is cost, and the sorting criterion for large root piles is profit.
2. Put all the items into the small root pile.
3. Enter the fixed iterative process:
4. The small root pile pops all items with a cost less than or equal to M into the big root pile.
5. The most profitable projects are popped up.
6. M plus the profit of the project just completed.
7. Stop iterating until the number of completed projects is K.

Code:

``````public class Solution {

static class Project {
int cost;
int profit;
public Project(int cost, int profit) {
this.cost = cost;
this.profit = profit;
}
}

static class minCostComparator implements Comparator<Project> {
@Override
public int compare(Project p1, Project p2) {
return p1.cost - p2.cost;
}
}

static class maxProfitComparator implements Comparator<Project> {
@Override
public int compare(Project p1, Project p2) {
return p2.profit - p1.profit;
}
}

public static int findMaximumFund(int M, int K, int[] costs, int[] profits) {
if (M == 0 || K == 0) {
return 0;
}
// 通过花费构建小根堆
PriorityQueue<Project> costSmallRootHeap = new PriorityQueue<>(new minCostComparator());
// 通过利润构建大根堆
PriorityQueue<Project> profitBigRootHeap = new PriorityQueue<>(new maxProfitComparator());
// 将所有项目全部放入小根堆
for (int i = 0; i < costs.length; i ++) {
}
// 一共只能做K个项目
for (int i = 0; i < K; i ++) {
// 将小根堆中当前可以做的项目放入大根堆
while (!costSmallRootHeap.isEmpty() && costSmallRootHeap.peek().cost <= M) {
}
// 没有可以做的项目
if (profitBigRootHeap.isEmpty()) {
return M;
}
// 从大根堆中选选取利润最大的做
Project cur = profitBigRootHeap.poll();
M += cur.profit;
}
return M;
}
}

### 4.4 N Queen Problem

Title:

The N queen problem refers to placing N queens on the N*N chessboard, requiring any two queens to be in different rows, different columns, and not on the same diagonal line.

Given an integer n, return how many ways there are for queen n.

E.g:

n=1, return 1.

n=2 or 3, return 0. (The question of 2 queens and 3 queens will not work no matter how you put them)

n=8, return 92.

Analysis:

This problem is a classic problem, and the optimal solution is very complex, which is a dynamic programming problem with aftereffect.

If you are not writing a thesis, the best solution in the interview process is to use the depth-first idea, put the queen in each row in turn, and use violent recursion to try every possibility in each column.

The time complexity index of this solution is still very high.

Because there are N choices in the first row, N choices in the second row, and N choices in the third row, ..., there are N rows in total, so the time complexity is O(N^N).

Suppose, record = 2, record = 4 when the depth-first traversal reaches i = 2, there are three reasonable placement positions for Queen in the third line of the figure, and the three positions are then depthwise sequentially Prioritize traversal, and start again and again.

Code:

``````public static int nQueen(int n) {
if (n < 1) {
return 0;
}
int[] record = new int[n];
// 从第0行开始
return process(0, n, record);
}

/**
* @param i 当前遍历第几行
* @param n 一共多少行
* @param record [0,i-1]行已经摆放的Queen的位置,record=2表示第1行第2列已摆放一个Queen
* @return n行n列棋盘中摆n个Queen的合理的摆法总数
*/
public static int process(int i, int n, int[] record) {
// 遍历到最后一行的下一行结束递归,说明这条摆放方案合理
if (i == n) {
return 1;
}
// 记录[i,n-1]行合理的摆法总数
int result = 0;
// 尝试第i行的[0,n-1]列进行深度优先遍历
for (int j = 0; j < n; j ++) {
// 判断在第i行第j列的位置上是否能放Queen
if (isValid(i, j, record)) {
record[i] = j;
// 遍历下一行
result += process(i + 1, n, record);
}
}
return result;
}

// 检查第i行第j列能不能放Queen
public static boolean isValid(int i, int j, int[] record) {
// 遍历[0,i-1]行放过的所有Queen,检查是否和当前位置有冲突
for (int k = 0; k < i; k ++) {
// 判断是否是同一列或者是否共斜线(不可能共行)
if (record[k] == j || Math.abs(k - i) == Math.abs(record[k] - j)) {
return false;
}
}
return true;
}

### 4.5 N Queen Problem (Optimization)

Although the N-queen problem cannot be optimized in terms of time complexity, it can be optimized in terms of constants, and there are still many optimizations.

It can be said that the time complexity is still such a time complexity, but I can make it very low in constant time during the implementation process.

How low is it? For example, for a 14-queen problem, the solution with 4.4 will run for 5s, and the solution optimized with 4.5 will run for 0.2s. For a 15-queen problem, the 4.4 solutions will run for 1 minute, and the 4.5 optimized solutions will run for 1.5s.

Analysis:

Use bit operations to speed up. Bit arithmetic acceleration is a very common technique, it is recommended to master it.

Since bitwise operation is used, it is related to the storage form of the variables in the code. The type of the variables in this code is a 32-bit int, so the problem of 32 queens or more cannot be solved.

If you want to solve the problem of more than 32 queens, you can change the parameter type to long.

Code:

``````public static int nQueen(int n) {
if (n < 1 || n > 32) {
return 0;
}
int limit = n == 32 ? -1 : (1 << n) - 1;
return process(limit, 0, 0, 0);
}

/**
* @param n 一共多少行
* @param colLim 列的限制，1的位置不能放皇后，0的位置可以
* @param leftDiaLim 左斜线的限制，1的位置不能放皇后，0的位置可以
* @param rightDiaLim 右斜线的限制，1的位置不能放皇后，0的位置可以
* @return  n行n列棋盘中摆n个Queen的合理的摆法总数
*/
public static int process(int n, int colLim, int leftDiaLim, int rightDiaLim) {
// 皇后是否填满
if (colLim == n) {
return 1;
}
int mostRightOne = 0;
// 所有后选皇后的列都在pos的位信息上
int pos = n & (~ (colLim | leftDiaLim | rightDiaLim));
int res = 0;
while (pos != 0) {
// 提取出候选皇后中最右侧的1
mostRightOne = pos & (~ pos + 1);
pos = pos - mostRightOne;
// 更新限制，进入递归
res += process(n, colLim | mostRightOne,
(leftDiaLim | mostRightOne) << 1,
(rightDiaLim | mostRightOne) >>> 1);
}
return res;
}

Topics:
datastructure, alogorithm, java, tutorial, greedy algorithm

Published at DZone with permission of shaoyang liu.

Opinions expressed by DZone contributors are their own.