# 8. Non-deterministic Polynomial Completeness – Formal Languages and Automata Theory

## Non-deterministic Polynomial Completeness

NP-hard and NP-complete are classifications of the computational complexity of problems.

In this chapter, we introduce the theory of intractability: we discuss whether a problem can be solved in polynomial time or not, that is, the classes P and Non-deterministic Polynomial (NP) are discussed. The classification of problems into NP-hard problems and NP-complete (NPC) problems is explained with examples.

#### 8.1.1 Classification of Problems

The subject of computational complexity theory is focused on classifying problems by how hard they are. There are many different classifications depending on the time required to solve a problem. The following are the main classes.

1. P problems are those that can be solved by a deterministic Turing machine (TM) in polynomial time (‘P’ stands for polynomial). P problems form a class of problems that can be solved efficiently.
2. NP problems are those that can be solved by a non-deterministic TM in polynomial time. A problem is in NP if you can quickly (in polynomial time) test whether a solution is correct (without worrying about how hard it might be to find the solution). It is not known whether every NP problem can be solved efficiently. Note that NP does not stand for ‘non-polynomial’; there are many complexity classes that are much harder than NP.
3. Undecidable problems: For some problems, we can prove that there is no algorithm that always solves them, no matter how much time or space is allowed. One very uninformative proof of this is based on the fact that there are as many problems as there are real numbers, but only as many programs as there are integers; so there are not enough programs to solve all the problems. But we can also define explicit and useful problems which cannot be solved.

#### 8.2 P Problems

In computational complexity theory, P, also known as PTIME or DTIME(n^O(1)), is one of the most fundamental complexity classes. It contains all decision problems that can be solved by a deterministic TM using a polynomial amount of computation time or, simply put, in polynomial time. P is known to contain many natural problems, including the decision versions of linear programming, computing the greatest common divisor and finding a maximum matching.
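The greatest-common-divisor example can be made concrete. Euclid's algorithm runs in time polynomial in the number of digits of its inputs, which is what places gcd in P; a minimal sketch in Python:

```python
def gcd(a, b):
    # Euclid's algorithm: each pair of iterations at least halves the
    # smaller operand, so the number of steps is proportional to the
    # number of bits of the input. The running time is polynomial in
    # the input length, so gcd is in P.
    while b != 0:
        a, b = b, a % b
    return a
```

The key point is that "efficient" is measured against the input length in bits, not against the magnitudes of the numbers themselves.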

#### 8.3 NP Problems

In computational complexity theory, NP is one of the most fundamental complexity classes.

Intuitively, NP is the set of all decision problems for which the ‘yes’ instances have short proofs of the fact that the answer is indeed ‘yes’. More precisely, these proofs have to be verifiable in polynomial time by a deterministic TM. In an equivalent formal definition, NP is the set of decision problems solvable in polynomial time by a non-deterministic TM. Figure 8.1 indicates the relation between the classes P and NP. Fig. 8.1 Class of P and NP Problems

The complexity class P is contained in NP, but NP contains many important problems, the hardest of which are called NPC problems, for which no polynomial-time algorithms are known. The most important open question in complexity theory, the P = NP problem, asks whether polynomial-time algorithms actually exist for NPC problems. It is widely believed that this is not the case.

#### 8.4 Tractable Problems

The problems that can be solved in polynomial time are called tractable problems. The problems that cannot be solved in polynomial time are called intractable problems. It is believed that no efficient algorithm exists for any NPC decision problem, so these problems are considered to be intractable.

#### 8.5 NP-complete

In computational complexity theory, the complexity class NPC is a class of problems having two properties:

1. A solution to a problem in this class can be verified quickly (in polynomial time); the set of problems with this property is called NP.
2. If the problem in this class can be solved quickly (in polynomial time), then so can every problem in NP.

The most notable characteristic of NPC problems is that no fast solution to them is known; that is, the time required to solve the problem using any currently known algorithm increases very quickly as the size of the problem grows. As a result, the time required to solve even moderately large versions of many of these problems easily reaches billions or trillions of years, using any amount of computing power available today. As a consequence, determining whether or not it is possible to solve these problems quickly is one of the principal unsolved problems in Computer Science today.

While a method for computing the solutions to NPC problems using a reasonable amount of time remains undiscovered, computer scientists and programmers still frequently encounter NPC problems. An expert programmer should be able to recognize an NPC problem so that he or she does not unknowingly waste time trying to solve a problem that so far has eluded generations of computer scientists. Hence, NPC problems are often addressed using approximation algorithms in practice.

NPC is a subset of NP, the set of all decision problems whose solutions can be verified in polynomial time; NP may be equivalently defined as the set of decision problems that can be solved in polynomial time on a non-deterministic TM. A problem p in NP is also in NPC if and only if every other problem in NP can be quickly transformed into p. (The word ‘quickly’ in these statements refers to polynomial time; a more formal definition of NPC is given below.) NPC can also be used as an adjective: problems in the class NPC are known as NPC problems.

NPC problems are studied because the ability to quickly verify solutions to a problem (NP) seems to correlate with the ability to quickly solve that problem (P). It is not known whether every problem in NP can be quickly solved – this is called the P = NP problem. But if any single problem in NPC can be solved quickly, then every problem in NP can also be quickly solved, because the definition of an NPC problem states that every problem in NP must be quickly reducible to every problem in NP-complete. Because of this, it is often said that the NPC problems are ‘harder’ or ‘more difficult’ than NP problems in general.

A decision problem C is NPC if

1. C is in NP, and
2. Every problem in NP is reducible to C.

C can be shown to be in NP by demonstrating that a candidate solution to C can be verified in polynomial time.

A problem K is reducible to problem C if there is a polynomial-time many–one reduction, a deterministic algorithm that transforms instances k ∈ K into instances c ∈ C, such that the answer to c is YES if and only if the answer to k is YES. To prove that an NP problem A is in fact an NPC problem, it is sufficient to show that an already known NPC problem reduces to A.

Note that a problem satisfying Condition 2 is said to be NP-hard, whether or not it satisfies Condition 1.

A consequence of this definition is that if we had a polynomial-time algorithm (on a UTM, or any other Turing-equivalent abstract machine) for C, we could solve all problems in NP in polynomial time.

#### 8.6 NP-hard

Venn diagram for P, NP, NP-complete, and NP-hard set of problems is shown in Figure 8.2. Fig 8.2 Relation Between NP-hard and NP Complete

NP-hard (non-deterministic polynomial-time hard), in computational complexity theory, is a class of problems that are informally ‘at least as hard as the hardest problems in NP’. A problem H is NP-hard if and only if there is an NPC problem L that is polynomial-time Turing-reducible to H, that is, L ≤_T H. In other words, L can be solved in polynomial time by an oracle machine with an oracle for H. Informally, we can think of an algorithm that calls such an oracle machine as a subroutine for solving H and that solves L in polynomial time if each subroutine call takes only one step to compute. NP-hard problems may be of any type: decision problems, search problems or optimization problems. We note

1. Problem H is at least as hard as L, because H can be used to solve L.
2. Since L is NP-complete, and hence a hardest problem in the class NP, problem H is also at least as hard as NP; but H does not have to be in NP and hence does not have to be a decision problem (even if it is a decision problem, it need not be in NP).
3. Since NPC problems transform to each other by polynomial-time many–one reduction (also called polynomial transformation), all NPC problems can be solved in polynomial time by a reduction to H. Hence, all problems in NP reduce to H; note however, that this involves combining two different transformations: from NPC decision problems to NPC problem L by polynomial transformation, and from L to H by polynomial Turing reduction.
4. If there is a polynomial algorithm for any NP-hard problem, then there are polynomial algorithms for all problems in NP, and hence P = NP.
5. If P ≠ NP, then NP-hard problems have no solutions in polynomial time, while P = NP does not resolve whether the NP-hard problems can be solved in polynomial time.
6. If an optimization problem H has an NPC decision version L, then H is NP-hard.
7. If H is in NP, then H is also NPC because in this case the existing polynomial Turing transformation fulfils the requirements of polynomial-time transformation.

A common mistake is to think that the ‘NP’ in ‘NP-hard’ stands for ‘non-polynomial’. Although it is widely suspected that there are no polynomial-time algorithms for these problems, this has never been proved.

An example of an NP-hard problem is the following decision problem called SUBSET-SUM: Given a set of integers, does any non-empty subset of them add up to zero? That is a yes/no question, and it happens to be NP-complete. Another example of an NP-hard problem is the optimization problem of finding the least-cost route through all nodes of a weighted graph. This is commonly known as the Travelling Salesman Problem (TSP).
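SUBSET-SUM's membership in NP can be seen from its verifier: given a proposed subset (a certificate), checking it takes only polynomial time. A minimal sketch in Python (the function name and representation are illustrative):

```python
def verify_subset_sum(numbers, certificate):
    """Check that `certificate` is a non-empty sub-multiset of `numbers`
    that sums to zero. This is a polynomial-time verification, which is
    all that membership in NP requires."""
    if not certificate:
        return False
    pool = list(numbers)
    for x in certificate:
        if x not in pool:
            return False
        pool.remove(x)          # respect multiplicities in the input
    return sum(certificate) == 0
```

Finding such a subset may require trying exponentially many candidates; checking a given one does not.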

There are also decision problems that are NP-hard, but not NP-complete; for example, the halting problem. This is the following problem: ‘Given a program and its input, decide if it will run forever.’ That is a yes/no question, so this is a decision problem. It is easy to prove that the halting problem is NP-hard, but not NP-complete. For example, the Boolean satisfiability problem can be reduced to the halting problem by transforming it to the description of a TM that tries all truth value assignments and, when it finds one that satisfies the formula, halts; otherwise, it goes into an infinite loop. It is also easy to see that the halting problem is not in NP, since all problems in NP are decidable in a finite number of operations, while the halting problem, in general, is not. But not all NP-hard problems that are not NPC are undecidable. For instance, the language of true quantified Boolean formulas is decidable in polynomial space, but probably not in non-deterministic polynomial time.

An alternative definition of NP-hard that is often used restricts NP-Hard to decision problems and then uses polynomial-time many–one reduction instead of Turing reduction. So, formally, a language L is NP-hard if ∀L′ ∈ NP, L′ ≤p L. If it is also the case that L is in NP, then L is called NP-complete.

#### 8.7 Examples of Problems in Different Classes

Example 8.1

Long simple paths

A simple path in a graph is just one without any repeated edges or vertices. To describe the problem of finding long paths in terms of complexity theory, we need to formalize it as a yes-or-no question: Given a graph G, vertices s and t, and a number k, does there exist a simple path from s to t with at least k edges? A solution to this problem would then consist of such a path.

Why is this in NP? If you are given a path, you can quickly look at it and add up the length, double-checking that it really is a path with length at least k. This can be done in linear time; so certainly it can be done in polynomial time.
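The linear-time check described above can be sketched as follows (the graph representation, a map from each vertex to its neighbour set, is an assumption for illustration):

```python
def verify_long_path(graph, s, t, k, path):
    """Verify that `path` is a simple path from s to t with at least
    k edges. Each check below is linear in the length of the path."""
    if not path or path[0] != s or path[-1] != t:
        return False
    if len(path) - 1 < k:                      # number of edges
        return False
    if len(set(path)) != len(path):            # simple: no repeated vertex
        return False
    # every consecutive pair must be an edge of the graph
    return all(v in graph[u] for u, v in zip(path, path[1:]))
```

This is exactly the "given a path, look at it and add up the length" check: easy to run, regardless of how the path was found.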

However, we do not know whether this problem is in P, as no known procedure finds such a path in time polynomial in m, n and k. Since it seems that one has to explore all possible paths, treating every node as a possible start or end point, we believe that no such algorithm exists. In fact, this problem is known to be NP-complete.

There are algorithms that solve the problem above. For instance, the following is such an algorithm: list all 2^m subsets of edges, and check whether any of them solves the problem. But as far as we know, there is no algorithm that runs in polynomial time.

Example 8.2

Cryptography

Suppose we have an encryption function, for example,

Code = RSA (key, text)

The ‘RSA’ encryption works by performing some simple integer arithmetic on the text and the key, which consists of a pair (p, q) of large prime numbers. One can perform the encryption knowing only the product pq; but to decrypt the code you instead need to know a different product, (p − 1)(q − 1).

A standard assumption in cryptography is the ‘known plaintext attack’: We have the code for some message, and we know (or can guess) the text of that message. We want to use that information to discover the key, so we can decrypt other messages sent using the same key.

Formalized as an NP problem, we simply want to find a key for which code = RSA (key, text). If you are given a key, you can test it by doing the encryption yourself: so this is in NP.

The hard question is, how do you find the key? For the code to be strong, we hope it is not possible to do much better than a brute force search.

Another common use of RSA involves ‘public key cryptography’: A user of the system publishes the product pq, but does not publish p, q or (p − 1)(q − 1). That way anyone can send a message to that user by using the RSA encryption, but only the user can decrypt it. Breaking this scheme can also be thought of as a different NP problem: Given a composite number pq, find a factorization into smaller numbers.

One can test a factorization quickly (just multiply the factors back together again). So the problem is in NP. Finding a factorization seems to be difficult, and we think it may not be in P. However, there is some strong evidence that it is not NPC either. It seems to be one of the (very rare) examples of problems between P and NPC in difficulty.

Example 8.3

Chess

We have seen in the news a match between the world chess champion Garry Kasparov and a very fast chess computer, Deep Blue. The computer lost the match, but won one game and tied others.

What is involved in chess programming? Essentially, the sequences of possible moves form a tree: The first player has a choice of 20 different moves (most of which are not very good), after each of which, the second player has a choice of many responses, and so on. Chess-playing programs work by traversing this tree finding what the possible consequences of each different move would be.

The tree of moves is not very deep–a typical chess game might last 40 moves, and it is rare for one to reach 200 moves. Since each move involves a step by each player, there are at most 400 positions involved in most games. If we traversed the tree of chess positions only to that depth, we would need enough memory to store the 400 positions on a single path at a time. This much memory is easily available on the smallest computers you are likely to use.

So perfect chess playing is a problem in PSPACE. (Actually one must be more careful in definitions. There are only a finite number of positions in chess. So, in principle you could write down the solution in constant time. But that finite number would be very large. Generalized versions of chess on larger boards are in PSPACE).

The reason this deep game-tree search method cannot be used in practice is that the tree of moves is very bushy, so that, even though it is not deep, it has an enormous number of vertices. We will not run out of space if we try to traverse it, but we will run out of time before we get through even a small fraction of the way. Some pruning methods, notably ‘alpha–beta search’, can help reduce the portion of the tree that needs to be examined, but they are not enough to solve this difficulty. For this reason, actual chess programs, instead, only search a much smaller depth (such as up to 7 moves), at which point they do not have enough information to evaluate the true consequences of the moves and are forced to guess by using heuristic ‘evaluation functions’ that measure simple quantities such as the total number of pieces left.

Example 8.4

Halting problem

Suppose you are working on a lab for a programming class, suppose you have written your program, and also suppose you started to run it. After five minutes, it is still going. Does this mean it is in an infinite loop, or is it just slow?

It would be convenient if your compiler could tell you that your program has an infinite loop. However, this is an undecidable problem: There is no program that will always correctly detect infinite loops.

Some people have used this idea as evidence that people are inherently smarter than computers, since it shows that there are problems computers cannot solve. However, it is not clear to me that people can solve them either. Here’s an example:

```c
#include <math.h>

int main(void)
{
    int x = 3;
    for (;;) {
        for (int a = 1; a <= x; a++)
            for (int b = 1; b <= x; b++)
                for (int c = 1; c <= x; c++)
                    for (int i = 3; i <= x; i++)
                        if (pow(a, i) + pow(b, i) == pow(c, i))
                            return 0;   /* a counterexample halts the search */
        x++;
    }
}
```

This program searches for solutions to Fermat’s last theorem. Does it halt? (You can assume I am using a multiple-precision integer package instead of built-in integers, so you do not have to worry about arithmetic overflow complications.) To be able to answer this, you have to understand the recent proof of Fermat’s last theorem. There are many similar problems for which no proof is known; so we are clueless whether the corresponding programs halt.

#### 8.8 NP-completeness

The theory of NP-completeness is a solution to the practical problem of applying complexity theory to individual problems. NPC problems are defined in a precise sense as the hardest problems in NP. Even though we do not know whether there is any problem in NP that is not in P, we can point to an NPC problem and say that if there are any hard problems in NP, that problem is one of them.

(Conversely, if every NPC problem is easy, then everything in NP is easy. So NP-completeness can be thought of as a way of making the big P = NP question equivalent to smaller questions about the hardness of individual problems.)

So if we believe that P and NP are unequal, and we prove that some problem is NP-complete, we should believe that it does not have a fast algorithm.

For unknown reasons, most problems we have looked at in NP turn out either to be in P or NP-complete. So the theory of NP-completeness turns out to be a good way of showing that a problem is likely to be hard, because it applies to a lot of problems. But there are problems that are in NP, that are not known to be in P, and are not likely to be NP-complete; the integer factorization problem of Example 8.2 is one such problem.

#### 8.9 Reduction

Formally, NP-completeness is defined in terms of ‘reduction’, which is just a complicated way of saying one problem is easier than another.

We say that A is easier than B (and write A < B) if we can write down an algorithm for solving A that uses a small number of calls to a subroutine for B (with everything outside the subroutine calls being fast, polynomial time). There are several minor variations of this definition depending on the detailed meaning of ‘small’–it may be a polynomial number of calls, a fixed constant number or just one call.

Then if A < B and B is in P, so is A: We can write down a polynomial algorithm for A by expanding the subroutine calls to use the fast algorithm for B.

So, ‘easier’ in this context means that if one problem can be solved in polynomial time, so can the other. It is possible for the algorithms for A to be slower than those for B, even though A < B.

As an example, consider the Hamiltonian cycle problem: Does a given graph have a cycle visiting each vertex exactly once? Here is a solution, using the longest path as a subroutine:

```
for each edge (u, v) of G
    if there is a simple path of length n - 1 from u to v
        return yes    // path + edge form a cycle
return no
```

This algorithm makes m calls to a longest path subroutine, and does O (m) work outside those subroutine calls. So it shows that Hamiltonian cycle < longest path. (It does not show that Hamiltonian cycle is in P, because we do not know how to solve the longest path sub-problems quickly.)
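The reduction can be made executable by plugging in a deliberately slow brute-force stand-in for the longest-path subroutine; this demonstrates the structure of the reduction, not a fast algorithm (all names here are illustrative, and the graph is a map from each vertex to its neighbour set):

```python
from itertools import permutations

def has_simple_path(graph, u, v, length):
    """Brute-force stand-in for the longest-path subroutine:
    is there a simple path from u to v with exactly `length` edges?"""
    vertices = set(graph)
    for middle in permutations(vertices - {u, v}, length - 1):
        path = (u,) + middle + (v,)
        if all(b in graph[a] for a, b in zip(path, path[1:])):
            return True
    return False

def hamiltonian_cycle(graph):
    """Hamiltonian cycle < longest path: for each edge (u, v), a simple
    path of n - 1 edges from u to v plus the edge itself closes a cycle."""
    n = len(graph)
    for u in graph:
        for v in graph[u]:
            if has_simple_path(graph, u, v, n - 1):
                return True
    return False
```

Replacing `has_simple_path` with a polynomial-time routine (if one existed) would immediately make `hamiltonian_cycle` polynomial, which is precisely what the reduction asserts.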

As a second example, consider a polynomial-time problem such as the minimum spanning tree. For every other problem B, B < minimum spanning tree, since there is a fast algorithm for minimum spanning trees using a subroutine for B. (We do not actually have to call the subroutine, or we can call it and ignore its results.)

Example 8.5

Fifteen puzzle problem

The n-puzzle is a sliding puzzle that consists of a frame of numbered square tiles in random order with one tile missing. If the size of puzzle is 3 × 3, it is called the 8-puzzle or 9-puzzle; and if size is 4 × 4, the puzzle is called the 15-puzzle or 16-puzzle. For the puzzle with initial configuration in Figure 8.3, the object of the puzzle is to place the tiles in order as shown in Figure 8.4, by making sliding moves that use the empty space. Fig. 8.3 Initial Configuration Fig. 8.4 Final Configuration

The n-puzzle is a classical problem for modelling algorithms involving heuristics. Simple heuristics that can be used for estimating the number of moves needed to solve this problem are

1. Count the number of misplaced tiles
2. Find the sum of the Manhattan distances between each block and its position in the goal configuration

Both heuristics are admissible, that is, they never overestimate the number of moves left, which ensures optimality for certain search algorithms.
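Both heuristics are straightforward to compute from a board layout. A minimal sketch, assuming the board is stored row by row as a tuple with 0 for the blank (an illustrative representation):

```python
def misplaced_tiles(state, goal):
    # Heuristic 1: count the tiles (not the blank) that are out of place.
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def manhattan_distance(state, goal, width):
    # Heuristic 2: sum, over all tiles, of the horizontal plus vertical
    # distance between the tile's position and its goal position.
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue
        gidx = goal.index(tile)
        total += abs(idx // width - gidx // width) \
               + abs(idx % width - gidx % width)
    return total
```

Each tile out of place needs at least one move, and each unit of Manhattan distance needs at least one move, so neither heuristic overestimates; that is what makes them admissible.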

A simple parity argument shows that half of the starting states for the n-puzzle are impossible to resolve, no matter how many moves are made: the legal moves preserve an invariant that splits the possible states into two equivalence classes, and a configuration can be labelled reachable or unreachable according to whether it lies in the same class as the goal.

The invariant combines the parity of the permutation of all 16 squares with the parity of the distance moved by the empty square. Each move changes both the parity of the permutation and the parity of that distance, so their combination is preserved. In particular, if the empty square ends where it started, the permutation of the remaining pieces must be even. The same argument works for all rectangular boards or, more generally, for all boards with no odd cycles.

For larger versions of the n-puzzle, finding a solution is easy, but the problem of finding the shortest solution is NP-hard. For the 15-puzzle, lengths of optimal solutions range from 0 to 80 moves; the 8-puzzle can be solved in 31 moves or fewer.

Example 8.6

Knapsack problem

The knapsack problem is a selection problem: one must choose boxes to fill a bag so as to maximize the total value carried while keeping the overall weight within the capacity of the bag. Variants are multiple-constraint problems that also consider volume, shape, size and cost. In its decision form, the question is whether the bag can be filled with the most useful items without exceeding a given weight W.

In Figure 8.5, we have n kinds of items, 1 through n. Each kind of item has a value pj and a weight wj. We usually assume that all values and weights are non-negative. The maximum weight that we can carry in the bag is W. Fig. 8.5 Instance of Knapsack Problem

In general, the problem of 0−1 knapsack is formulated as follows:

Maximize Σj pj xj subject to Σj wj xj ≤ W, with xj ∈ {0, 1} for each j.

The bounded knapsack problem restricts the number xj of copies of each kind of item to a maximum integer value bj. Mathematically, the bounded knapsack problem can be formulated as follows:

Maximize Σj pj xj subject to Σj wj xj ≤ W, with xj ∈ {0, 1, …, bj} for each j.

Note: The unbounded knapsack problem places no upper bound on the number of copies of each kind of item.

Of particular interest is the special case of the problem with the following properties:

1. it is a decision problem,
2. it is a 0−1 problem,
3. for each kind of item, the weight equals the value: wj = pj.

#### 8.9.1 Computational Complexity

The knapsack problem is interesting from the perspective of computer science because

1. Using a dynamic programming approach, it can be solved in pseudo-polynomial time.
2. There is a fully polynomial-time approximation scheme, which uses the pseudo-polynomial time algorithm as a subroutine.
3. The problem is NPC to solve exactly; hence, it is expected that no algorithm can be both correct and fast on all instances.
4. Many instances that arise in practice, and ‘random instances’ from some distributions, can nonetheless be solved exactly.

One theme in the research literature is to identify what the ‘hard’ instances of the knapsack problem look like, or, viewed another way, to identify the properties of instances in practice that might make them more amenable than their worst-case NPC behaviour suggests.

Several algorithms are freely available to solve knapsack problems, based on dynamic programming approach, on branch and bound approach or on hybrids of the two approaches.

Solutions: If all weights (w1, ..., wn and W) are non-negative integers, the knapsack problem can be solved in pseudo-polynomial time using dynamic programming. The following describes a dynamic programming solution for the unbounded knapsack problem.

We assume that all weights are strictly positive (wj > 0) and wish to maximize the total value, subject to the constraint that total weight is less than or equal to a fixed W.

For each weight Y ≤ W, define A(Y) to be the maximum value that can be attained with total weight less than or equal to Y. The properties of A(Y) are as follows:

1. A(0) = 0 (the sum of zero items, i.e. the summation of the empty set)
2. A(Y) = max {pj + A (Y − wj) | wj ≤ Y}

where pj is the value of the jth kind of item.

Solution is obtained by calculating the values from A(0) up through A(W). Since the calculation of A(Y) involves examining n items, and there are W values of A(Y) to calculate, the running time of the dynamic programming solution is O(nW).
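The recurrence translates directly into a table-filling loop. A minimal sketch of the O(nW) computation (the function name is illustrative):

```python
def unbounded_knapsack(values, weights, W):
    """Compute A(W) for the unbounded knapsack problem, following the
    recurrence A(Y) = max{p_j + A(Y - w_j) | w_j <= Y}, with A(0) = 0.
    Runs in O(nW) time: W table entries, each examining n items."""
    A = [0] * (W + 1)
    for Y in range(1, W + 1):
        A[Y] = max((p + A[Y - w] for p, w in zip(values, weights) if w <= Y),
                   default=0)   # no item fits: the empty sum has value 0
    return A[W]
```

Note that the pseudo-polynomial bound O(nW) depends on the magnitude of W, not on its bit length, which is why this does not contradict NP-completeness.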

Dividing w1, ..., wn, W by their greatest common divisor is an obvious way to improve the running time.

The O(nW) complexity does not contradict the fact that the knapsack problem is NP-complete, since W, unlike n, is not polynomial in the length of the input to the problem. The length of the input to the problem is proportional to the number of bits in W, not to W itself.

#### 8.9.2 0−1 Knapsack Problem

A similar dynamic programming solution for the 0−1 knapsack problem also runs in pseudo-polynomial time. As above, we define A(j, Y) to be the maximum value that can be attained with weight less than or equal to Y using items up to j.

We can define A(j, Y) recursively as follows:

1. A(0, Y) = 0
2. A(j, 0) = 0
3. A(j, Y) = A(j − 1, Y) if wj > Y
4. A(j, Y) = max {A(j − 1, Y), pj + A(j − 1, Y − wj)} if wj ≤ Y.

The solution can then be found by calculating A(n, W). To do this efficiently, we can use a table to store previous computations. This solution will therefore run in O(nW) time and O(nW) space, although with some slight modifications we can reduce the space complexity to O(W).
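The 0−1 recurrence likewise becomes a table computation. A minimal sketch using the full O(nW)-space table (the O(W) space refinement is omitted for clarity; the function name is illustrative):

```python
def knapsack_01(values, weights, W):
    """0-1 knapsack by dynamic programming:
    A[j][Y] = best value using items 1..j with total weight <= Y."""
    n = len(values)
    A = [[0] * (W + 1) for _ in range(n + 1)]   # A[0][Y] = A[j][0] = 0
    for j in range(1, n + 1):
        p, w = values[j - 1], weights[j - 1]
        for Y in range(W + 1):
            A[j][Y] = A[j - 1][Y]               # skip item j
            if w <= Y:                          # or take item j once
                A[j][Y] = max(A[j][Y], p + A[j - 1][Y - w])
    return A[n][W]
```

The answer is A(n, W); keeping only the previous row of the table would reduce the space to O(W) as noted above.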

Example 8.7

Clique problem

A clique in a graph is a set of pairwise adjacent vertices or, in other words, an induced subgraph that is a complete graph. In the graph shown below, the vertices 1, 2 and 5 form a clique, as the subgraph formed by these nodes is a complete graph. The clique problem is the problem of determining whether a graph contains a clique of size at least k. Once we have located k or more mutually adjacent vertices, it is trivial to verify that they form a clique; this is why the clique problem is in NP. The corresponding optimization problem, the maximum clique problem, is to find the largest clique in a graph.

In computational complexity theory, the clique problem is a graph-theoretic NPC problem. The NP-completeness of the clique problem follows trivially from the NP-completeness of the independent set problem, because there is a clique of size at least k if and only if there is an independent set of size at least k in the complement graph. This is easy to see, since if a subgraph is complete, its complement subgraph has no edges at all.
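The complement-graph reduction is short enough to write out. A minimal sketch, with the graph stored as a map from each vertex to its neighbour set (the representation and the sample graph below are illustrative):

```python
def complement(graph):
    """Complement of an undirected graph: two distinct vertices are
    adjacent in the complement exactly when they are not adjacent here."""
    vertices = set(graph)
    return {v: vertices - graph[v] - {v} for v in graph}

def is_clique(graph, subset):
    # every pair of distinct vertices in `subset` must be adjacent
    return all(v in graph[u] for u in subset for v in subset if u != v)

def is_independent_set(graph, subset):
    # no pair of distinct vertices in `subset` may be adjacent
    return all(v not in graph[u] for u in subset for v in subset if u != v)
```

By construction, a vertex set is a clique in G if and only if it is an independent set in the complement of G, which is exactly the equivalence the NP-completeness argument uses.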

A brute force algorithm to find a clique in a graph is to examine each subgraph with at least k vertices and check whether it forms a clique. This algorithm is polynomial if k equals the number of vertices, or the number of vertices minus a constant (there are then only polynomially many subgraphs to examine), but not if k is, say, half the number of vertices.

A heuristic is to start by considering each node to be a clique of size one, and to merge cliques into larger cliques until there are no more possible merges. Two cliques A and B may be merged if each node in clique A is adjacent to each node in clique B. This requires only linear time (linear in the number of edges), but may fail to find a large clique because two or more parts of the large clique have already been merged with nodes that are not in the clique. The algorithm can be implemented most efficiently using the disjoint-set data structure.

Example 8.8

Travelling salesman problem

The Travelling Salesman Problem (TSP) asks for a shortest possible tour that visits each city exactly once, given a set of cities and the distance between each pair of cities. It is formulated as an optimization problem represented as a graph and is used as a benchmark for many optimization methods.

TSP can be modelled as a complete graph in which the cities are the vertices and the edge weights are the distances between the corresponding cities. The optimal TSP tour is then the shortest Hamiltonian cycle. If no edge exists between two cities, a new edge with an arbitrarily long distance can be added without affecting the optimal tour. The problem can be formulated on either a directed or an undirected graph, depending on whether a path exists in one or both directions.

#### 8.9.3 Computational Complexity

The Travelling Salesman Problem (TSP) is known to be NP-hard even for the case when the cities are in the plane with Euclidean distances, as well as in a number of other restrictive cases. Removing the condition of visiting each city ‘only once’ does not remove the NP-hardness, since it is easily seen in the planar case that there is an optimal tour that visits each city only once.

The problem, when formulated as the decision problem ‘given the costs and a number x, decide whether there is a round-trip route cheaper than x’, is an NPC problem.

In the theory of computational complexity, the decision version of TSP belongs to the class of NPC problems. Thus, it is assumed that there is no efficient algorithm for solving TSPs. In other words, it is likely that the worst-case running time for any algorithm for TSP increases exponentially with the number of cities. So, even some instances with only hundreds of cities will take many CPU years to solve exactly.

#### Solved Problems

Problem 1: Show that the travelling salesman problem is in class NP.

Solution: The input to the travelling salesman problem (TSP) is a graph with integer weights on the edges, as shown in Figure 8.6, and a weight limit w. Fig. 8.6 Simple Weighted Graph

The question is whether the graph has a Hamiltonian circuit of total weight at most w. A Hamiltonian circuit is a set of edges that connects the nodes into a single cycle, with each node appearing exactly once. Note that the number of edges on a Hamiltonian circuit must be equal to the number of nodes in the graph. On a non-deterministic machine, no branch of the computation needs more than polynomially many steps on an input of length n: on a multi-tape TM we can guess a permutation of the nodes in O(n^2) steps and check its total weight in a similar amount of time; thus, a single-tape non-deterministic TM can solve the TSP in at most O(n^4) time. We conclude that TSP belongs to the complexity class NP.
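The deterministic verification step for a proposed circuit can be sketched directly; the weight-table representation and function name below are illustrative:

```python
def verify_tsp(weights, tour, limit):
    """Verify a TSP certificate: `tour` lists each node exactly once and
    the circuit closes back to tour[0]; `weights[(u, v)]` gives the
    weight of the undirected edge {u, v}. Runs in polynomial time."""
    if len(set(tour)) != len(tour):
        return False                      # a node appears twice
    total = 0
    for u, v in zip(tour, tour[1:] + tour[:1]):
        key = (u, v) if (u, v) in weights else (v, u)
        if key not in weights:
            return False                  # missing edge: not a circuit
        total += weights[key]
    return total <= limit
```

Guessing the permutation is the non-deterministic part; this check is the cheap deterministic part that puts TSP in NP.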

Problem 2: Show that Kruskal’s algorithm is in class P.

Solution: This algorithm was developed by Joseph Kruskal. Kruskal’s algorithm creates a minimum spanning tree T by adding edges to T one at a time: the minimum cost spanning tree is built edge by edge. We start with the edge of minimum cost; if there are several edges with the same minimum cost, we select one of them and add it to the spanning tree T, provided its addition does not form a cycle. We then add the edge with the next lowest cost, and so on, repeating the process until we have selected n − 1 edges that form the complete minimum cost spanning tree. Thus the algorithm considers the edges in increasing order of cost, and an edge is added only if it does not form a cycle. Let us take the graph in Figure 8.7 to find the MST using Kruskal’s algorithm. We can organize the whole process in a table with three rows and one column per edge: the first row contains the edges in increasing order of cost, the second row contains the costs, and the third contains an ‘A’ if the corresponding edge is added.

Fig. 8.7 Simple Weighted Graph

The complexity of finding the MST using Prim’s method is O(n²), where n is the number of vertices; for Kruskal’s method the complexity is O(e²), where e is the number of edges. Both bounds are polynomial, so the problem of finding the MST belongs to class P.
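The edge-by-edge construction described above can be sketched in Python using a union–find structure to detect cycles; the edge list is a hypothetical example, not the graph of Figure 8.7.

```python
def kruskal(n, edges):
    """Build an MST edge by edge. `edges` is a list of (cost, u, v) tuples;
    returns (total_cost, list_of_chosen_edges). Sorting dominates the
    running time: O(e log e), comfortably polynomial."""
    parent = list(range(n))

    def find(x):                          # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, mst = 0, []
    for cost, u, v in sorted(edges):      # edges in increasing order of cost
        ru, rv = find(u), find(v)
        if ru != rv:                      # the edge does not form a cycle
            parent[ru] = rv
            total += cost
            mst.append((u, v, cost))
            if len(mst) == n - 1:         # n - 1 edges complete the tree
                break
    return total, mst

# hypothetical 4-vertex graph: (cost, u, v)
edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
print(kruskal(4, edges))   # (6, [(0, 1, 1), (1, 3, 2), (1, 2, 3)])
```

The union–find test replaces an explicit cycle check: two endpoints in the same component means the candidate edge would close a cycle, so it is skipped.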

Problem 3: Show that the satisfiability problem is in class NP.

Solution: The Boolean satisfiability problem (SAT) is a decision problem studied in complexity theory. Suppose we have a Boolean expression built from the variables (x1, x2, x3, …, xn), parentheses and the Boolean operators ∨, ∧ and ¬, which stand for logical OR, AND and NOT respectively. A truth assignment gives each variable a value; it satisfies the expression if it makes the whole expression true. The question that arises is as follows: do there exist values of the logical variables (x1, x2, x3, …, xn) that make a given Boolean expression true? That is, SAT asks whether there is a true/false assignment to the variables under which every clause, and hence the entire expression, evaluates to true. The Boolean expression is said to be satisfiable if truth values can be assigned to its free variables in such a way that the formula becomes true. SAT clearly belongs to the complexity class NP, because we can guess a truth assignment and verify that it satisfies the Boolean expression in polynomial time.
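The guess-and-verify argument can be sketched as follows, assuming a CNF formula encoded as lists of signed integers (a common convention, not notation from the text).

```python
def satisfies(clauses, assignment):
    """Polynomial-time verification of a guessed truth assignment.
    Each clause is a list of signed integers: +i means x_i, -i means NOT x_i.
    The CNF formula is satisfied when every clause has at least one true literal."""
    return all(any((lit > 0) == assignment[abs(lit)] for lit in clause)
               for clause in clauses)

# hypothetical formula: (x1 OR NOT x2) AND (x2 OR x3)
cnf = [[1, -2], [2, 3]]
print(satisfies(cnf, {1: True, 2: False, 3: True}))   # True
print(satisfies(cnf, {1: False, 2: True, 3: False}))  # False
```

The check visits each literal once, so it runs in time linear in the size of the formula; only finding a satisfying assignment is hard.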

#### Summary

1. The problems which have a Yes or No answer are called decision problems.
2. Problems that can be solved by some algorithm are called decidable problems.
3. Problems that cannot be solved by any algorithm are undecidable problems.
4. The class of decision problems that can be solved by a non-deterministic TM in polynomial time are called NP problems.
5. A problem is said to be NPC if it is contained in the class NP and all other problems in NP can be polynomially reduced to it.
6. A language L is said to be NPC if L is in NP and, for every language L′ in NP, there is a polynomial-time reduction of L′ to L.
7. If any NPC problem is in P, then P = NP.
8. P is the set of languages L such that L = L(M) for some deterministic TM M of time complexity T(n), where T(n) is a polynomial.
9. NP is the set of languages L such that L = L(M) for some non-deterministic TM M where, on any input of length n, no sequence of moves of M is longer than T(n), where T(n) is a polynomial.

#### Review Questions

1. Can P and NP-complete problems be NP-hard?

Answer: All NP-complete problems are NP-hard, but not all P problems are NP-hard.

2. What is the relation between NP-hard and NP-complete?

Answer: All NP-complete problems are NP-hard, but some NP-hard problems are known not to be NP-complete; the halting problem, for example, is NP-hard but not in NP.

3. Give an example of an NP-hard problem and justify the statement.

Answer: The subset sum problem is NP-hard. It is a decision problem: given a set of integers, decide whether some non-empty subset of them adds up to zero.
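The zero-sum variant stated in the answer can be sketched with a brute-force search over subsets (the input lists are hypothetical examples); note that checking any single guessed subset takes only linear time, which is what places the problem in NP.

```python
from itertools import combinations

def has_zero_subset(nums):
    """Decide the zero-sum variant of subset sum by brute force: all
    2**n - 1 non-empty subsets are tried (exponential time), although
    verifying any single guessed subset takes only linear time."""
    return any(sum(c) == 0
               for r in range(1, len(nums) + 1)
               for c in combinations(nums, r))

print(has_zero_subset([-7, 3, 4, 9]))   # True: -7 + 3 + 4 == 0
print(has_zero_subset([1, 2, 3]))       # False: all subset sums are positive
```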

4. What is the time complexity of the Hamiltonian cycle problem, and how is the problem classified?

Answer: No polynomial-time algorithm is known for the Hamiltonian cycle problem; the brute-force algorithm examines all orderings of the n nodes of the graph and so takes O(n!) time. The problem is NP-complete.

5. Is the travelling salesperson problem NP-complete?

Answer: The decision version of the travelling salesperson problem is NP-complete; the running time of every known algorithm for TSP increases exponentially as the number of cities increases.

#### Fill in the Blanks

1. __________ are those problems that can be solved by a Turing machine in polynomial time.
2. _____________ are those problems that can be solved by a non-deterministic Turing machine in polynomial time.
3. Search problems are ____________problems.
4. Sum of subsets problem is ____________ problem.
5. If P = NP, then NP-complete problem is in ___________.
6. NP-complete problem is the class of problems which are NP-hard and belong to NP. (State True/False)
7. Every NP problem reduces to SAT. (True/False)
8. If any NP-complete problem is polynomial time solvable, then P = NP. (True/False)
9. Travelling salesman problem is NP complete problem. (True/False)
10. B is undecidable if A is reduced to B and A is undecidable. (True/False)

1. P problems
2. NP
3. NP-hard
4. NP-hard
5. P
6. True
7. True
8. True
9. True
10. True

#### Objective Question Bank

1. An NP-hard problem is a problem
1. to which all the NP problems are polynomially reducible.
2. which is harder than NP problem.
3. which is not in NPC.
4. which is in NP.
2. Which of the following statements is incorrect?
1. P is a class of polynomial time or quick calculations.
2. NP is the class of search problems.
3. NP-complete problems are the hardest search problems.
4. None of the above.
3. Which of the following statements is incorrect?
1. If any NP-complete problem is contained in P, then we can conclude that P = NP.
2. If any NP-complete problem is contained in P, then we can conclude that P ≠ NP.
3. A problem is said to be computable if there exists an algorithm to solve it.
4. A problem is said to be tractable if there exists an efficient algorithm to solve it.
4. Which of the following problems belongs to the NP-complete complexity class?
1. Hamiltonian cycle.
2. Satisfiability problem (SAT).
3. 3-colouring.
4. All the above.
5. If A is reducible to B and A is undecidable, then
1. B is decidable.
2. B is undecidable.
3. A is decidable.
4. None of these.
6. Which of the following problems is NP-complete?
1. travelling salesman problem.
2. 3-colouring.
3. set partitioning.
4. All the above.
7. Which of the following statements is not correct?
1. If any problem in NP is not polynomial-time solvable, then all NP-complete problems are polynomial-time solvable.
2. If any problem in NP is not polynomial-time solvable, then all NP-complete problems are not polynomial-time solvable.
3. If any NP-complete problem is polynomial-time solvable, then P = NP.
4. All the above.
8. Which of the following statements is not correct?
1. NP = P iff the satisfiability problem is a P problem.
2. P is the class of problems which can be solved by a deterministic polynomial algorithm.
3. The circuit value problem (evaluating a given Boolean circuit on a given input) is NP-hard.
4. If any NPC problem can be solved in polynomial time, then all NP problems can be solved in polynomial time.
9. Problem A is NP-complete iff
1. A is NP-hard.
2. A is in NP.
3. Both a and b are true.
4. A is in complexity class of NP.
10. Intractable problems are
1. Not solvable.
2. Not in NP.
3. Not in P.
4. In NP-complete.

1. a
2. d
3. b
4. d
5. b
6. d
7. a
8. c
9. c
10. c

#### Exercises

1. Determine whether the following problems are in P, NP or NP-complete.
1. Satisfiability Problem.
2. Travelling Salesman Problem.
3. Knapsack Problem.
4. Set Partitioning.
5. Hamiltonian Cycle.
2. Explain the different complexity classes.
3. What is the difference between NP-hard problems and NP-complete problems?
4. Show that finding minimum spanning tree is in Class P.
5. How are P problems different from NP-problems?