251 Final Review

Table of Contents
  1. 1. Deductive System
  2. 2. Propositional Logic
  3. 3. Formalization of Proof
  4. 4. Finite Automata
  5. 5. Computational Complexity
  6. 6. Polynomial time and the class P
  7. 7. Basics
    1. 7.1. Matching
      1. 7.1.1. Maximum Matching
      2. 7.1.2. NP and NP-completeness
      3. 7.1.3. Problems
  8. 8. Probability

Deductive System

A Deductive System consists of

  • One or more initial objects
  • One or more deduction rules

Propositional Logic

doesn’t have quantifiers, or “First Order Logic”

  • Propositional Variables: basic statements that can be either true or false, like $p, w, r, x_1, x_2, x_3$
  • Connectives:
    • Not $\neg$
    • And $\wedge$
    • Or $\vee$
    • Implies $\rightarrow$
    • If and Only if $\leftrightarrow$

formula: Connectives and P.V.
well-formed formula:

  • Initial: Any variable
  • Deduction Rules: From $A$, can obtain $\neg A$. From $A, B$, can obtain $A\vee B$, $A\wedge B$, $A\rightarrow B$, $A\leftrightarrow B$.

A formula is a binary tree in which:

  • 2-child nodes are labeled by binary ops
  • 1-child nodes are labeled by unary ops
  • 0-child nodes are labeled by variables

Truth Assignment V: setting of true and false for each variable
Given a formula $S$, its truth value $V[S]$ can be defined by structural induction

Satisfiability:

  • $V$ satisfies $S$: $V[S] = T$
  • $S$ is satisfiable: $\exists V$ such that $V[S] = T$
  • $S$ is unsatisfiable: $\forall V, V[S] = F$
  • $S$ is tautology: $\forall V, V[S] = T$

Equivalent: Formulas $R$ and $S$ are equivalent, $R \equiv S$ if $\forall V, V[R] = V[S]$.

Entailment: Formulas $A_1, …, A_m$ entail formula $S$, written $A_1, …, A_m \models S$ if every truth assignment which makes $A_i$ true makes $S$ true.

Truth table: It represents a Boolean function, $f: {0, 1}^n \rightarrow {0, 1}$.

  • There are $2^{2^n}$ truth tables on n variables.
  • Every truth table can be computed by some formula, only using $\neg, \vee, \wedge$

Circuits

  • In circuits, nodes (gates) may have fan-out > 1.
  • Formulas are trees: all nodes have fan-out 1.
  • Circuits can reuse already-computed pieces. Formulas cannot; everything must be “rebuilt”.
  • Deduction viewpoint: The circuit is the deduction. The formula is the last line.

Formalization of Proof

First Order Logic = Prop. Logic + {$\forall$, $\exists$, $=$, “constants”, “relations”, “functions”}
Variables are now objects.
Vocabulary is a collection of constant-names, function-names, relation-names.
Interpretation

  • Specifies the universe
  • Maps constant-name to object
  • Maps relation-name to actual relation
  • Maps function-name to actual function
    Satisfiability/Tautology: Similar to that in first order logic.

Finite Automata

Alphabet: A nonempty finite set $\Sigma$ of symbols, $\Sigma = {0, 1}$ usually.
String: A finite sequence of 0 or more symbols.

  • The length-0 string is $\epsilon$
  • $\Sigma^n$ means all strings with length n.
  • $\Sigma^{*}$ means all strings over $\Sigma$

Language: A collection of strings. A subset of $\Sigma^{*}$.

Thus, we can think of a decision problem as a function:
$f: \Sigma^{*} \rightarrow {NO, YES}$
Decision problem is finding a language L such that f(element in L) is true

DFA: $L(M) = x\in \Sigma^{*}: M \text{accepts } x$.

A deterministic finite automaton is a 5-tuple:

  • $Q$ is a nonempty finite set of states
  • $\Sigma$ is an alphabet
  • $\delta: Q * \Sigma \rightarrow Q$ is the transition function
  • $q_0\in Q$ is the start state
  • $F\subset Q$ is the set of accepting states

Accept
Let $w = w_1w_2w_3…w_n$
We say that M accepts string w if there exists states $r_0r_1…r_n\in Q$ such that

  • $r_0 = q_0$
  • $\delta(r_{t-1}, w_t) = r_t$ for all $t$
  • $r_n\in F$.
    Otherwise, we say $M$ rejects $w$.
    The sequence $r_1r_2…r_n$ is called the computation trace.

Regular Languages
A language is regular if there is a DFA that decides it

Proving a language is not regular

  1. Assume for contradiction there is a DFA M which decides language L
  2. Argue there are two strings x, y which reach the same state in M.
  3. Show there is a string z such that $xz\in L$ and $yz\notin L$.
    Example: $a^nb^n$
  • $L_1\cup L_2$ is regular if $L_1$ and $L_2$ are.(Union Theorem)
  • $L_1\dot L_2$ (concatenation)
  • $L^{*}$ is regular if $L$ is.

More on Regular Languages

  1. The empty language is regular
  2. The singleton of empty string is regular
  3. The singleton of each character in the alpahbet is regular
  4. Union
  5. Concatenation
  6. Kleene star
  7. In fact, all set operations can be performed
    (On how to prove a language is regular)[http://cseweb.ucsd.edu/~clbailey/ClosureProofTemplate.htm]
    (On how to prove regular languages are closed under reverse)[http://web.cecs.pdx.edu/~hook/cs581sp11/reverse.pdf]
    Pumping Lemma Not really understand
  • Injective: $a \neq a’ \rightarrow f(a) \neq f(a’)$
  • Surjective: $\forall b\in B, \exists a\in A, f(a) = b$
  • Bijective

$\mathbb{N} = \mathbb{Z}$
Countable = finite or countably infinite
Cantor’s Theorem
For any non-empty set $A$, $|A|\lt |P(A)|$. This can be proven by diagonalization. $S = a\in A: a\notin f(a)$ $S$ is defined by diagonalization so that $S$ cannot equal any $f(a)$.
Continuum Hypothesis (Hilbert’s 1st Problem)
There is no set $S$ such that $|\mathbb{N}|\lt |S|\lt |P(\mathbb{N})|$

String encoding: Computational problems can be serialized into a string

Computational Complexity

  • Every multitape TM has an equivalent single tape TM.
  • Running time depends on the particular model you choose
  • Running time is defined as $T_A(n) = max{# \text{steps A takes on I}}$ (max input instance)
  • intrinsic complexity is defined by $min$ (min algorithm).
  • Entscheidungsproblem: Decide whether a given statement is valid with a given set of axioms
  • To compute summations:
    • Rough bounding
    • Exact computation
    • Induction
    • Telescoping series ($\frac{1}{i(i+1)}$)
    • Comparison with an integral
  • Master Method
    For $T(n) \le a\dot T(\frac{n}{b}) + O(n^d)$,
    • $a = b^d$, $O(n^dlog n)$
    • $a < b^d$, $O(n^d)$
    • $a > b^d$, $O(n^{log_ba})$

Prove Countability or Uncountability:

  • countable cartesian product of countable sets
    • Prime factorization
  • Countable union of countable sets
  • Set of Finite Subsets of Countable Set
    • Induction

Strong Church-Turing Thesis: The intuitive notion of “efficiently computable” is captured by functions efficiently computable by a TM.

Polynomial time and the class P

$DTIME(T(n)) = $ the language decided by an $O(T(n))$ time algorithm.
$P=DTIME(n^k)$ union of all
$EXP=DTIME(2^{n^k})$ union of all
P is a subset of EXP.
We can prove that Polynomial time algorithms can solve every decidable problem
$HWTB$ is the language such that M takes at most $n^3$ steps. This cannot be decided in $O(n^2)$ steps, which can be proved with diagonalization algorithm.
Time Hierarchy Theorem
Let $T(n)$ be a time-constructible algorithm. $\epsilon > 0$. Then there is a problem which cannot be decided in time $T(n)$, but can be decided in time $T(n)^{1+\epsilon}$. $DTIME(T(n))$ is a PROPER subset of $DTIME(T(n)^{1+\epsilon})$.

Space Complexity

  • Use two-tape TM that has a work tape and an input tape that is read-only.
    $DSPACE(T(n)) = $ the language decided by an $O(T(n))$ space algorithm.
    PSPACE, L is log space. L is a subset of PSPACE.
    Space Hierarchy Theorem
    If a TM uses $S(n)\ge log_2(n)$, then it decides the language using $\le 2^{O(S(n))}$ time (because of the number of possible configurations)
  • $L\subset P$
  • $PSPACE\subset EXP$
    Circuit Complexity
    A circuit family is an infinite answer to infinite number of questions
  • The size of a circuit is the total number of gates, including inputs as gates too
  • Circuit complexity is the size of the minimal circuit family that decides the language
    Circuit complexity is ALWAYS bounded by $2^n$.
    $f_A(x_1, x_2, …, x_n) = (x_1\wedge f_A(1, x_2, …, x_n))\wedge (\not x_1\wedge (f_A(0, x_2, .., x_n)))$
    This has recurrence $s(n) \le 2s(n-1)+5$.
    W.T.S.: There is a language L whose circuit complexity is at least $2^n/4n$.
  1. Number of possible functions is $2^{2^n}$
  2. Number of circuits of size s is $\le 2^{4slogs}$
    Claim 2: for each gate, 2 bits for its type, 2logs bits for which gates the inputs are from.

Basics

  • Regular: all nodes have same degrees
  • Walk
  • Path: Walk with no repeated vertices
  • Cycle: a walk from u to u where the only repeated vertex is u
  • Tree: connected; n-1 edges; acyclic
  • Local Search guarantees to get $\ge \frac{m}{2}$
    • Each vertex has $\ge \frac{deg(v)}{2}$ edges cut because otherwise we can flip its color to get more cuts

Problem: Identify all nodes reachable from s.
BFS, DFS, AFS
AFS finds a spanning tree rooted at s (bag, mark)
Why does AFS halt?
If we put edge in the bag, get a spanning tree.
Full AFS finds all connected components
CFS: Cheapest First Search
MST: subset of edges of minimum total cost such that all vertices connected
Prim: put the cheapest edge in the bag. If use Priority Queue, $O(Elog(E))$ run-time.
MST Cut Property e is the cheapest edge from S to not S, then must be in the MST.
Kruskal: Go through edges, add as long as it does not make cycles
Boruvka: Add the cheapest edges out of each c.c.

Matching

Bipartite Graph: there is a bipartition X and Y such that each edge connects x and y.
Matching: a subset of edges that do not share an endpoint.
Maximum Matching: A matching with largest number of edges
Perfect Matching: A matching that covers all vertices
Stable matching: There is no unstable pair(u, v not matched to each other, but they prefer each other to their current partners)
GS-algorithm: While a man is unmatched: propose to w(highest in the list)

  • $O(n^2)$ (because number of proposals are $n^2$).
  • Perfect Matching
  • Stable
    GS always finds the Best Valid Partner
    Prove: Man always finds the best. Woman always finds the worst.

Maximum Matching

Augmenting path(with respect to matching M): edges alternate between in M and not in M.
Prove: If there is an aug. path, M is not maximum
Find Maximum Matching: start with an edge, find augment path, …
To find aug. path: direct edges in M from left to right.

NP and NP-completeness

NP: We have a simple proof to verify a solution

  • length of proof is polynomial
  • proof can be verified in polynomial time
    Cook-Levin Theorem: SAT is NP-complete

To show A is NP-complete:

  1. A is in NP
  2. Pick a NP-complete problem L
  3. Show L reduces to A

NP: Nondeterminsitic Polynomial Time

C-hard: C can be reduced to A $\forall C \in C$
C-complete: C is in $C$.

Problems

  • $HAM-CYCLE \le HAM-PATH$
    • For each edge $(u, v)$, add “tenacle” $u’ v’$.
  • $TSP \le HAM-CYCLE$
    • Set weight of original edges to be $0$, others be $1$.
  • $SUBSET-SUM \le 3-SAT$
    • For a $n$-var, $k$-clause 3-SAT, we build a table of vectors where lower $n$ bits, upper $k$ bits. If $x_i$ or $\not x_i$ then the lower $i$ bit is 1. If it is present in the $j$th clause, then upper $j$ is 1. For each upper column, we add two “slack rows” where one is 1, the other is 2.
    • If the $3-SAT$ is satisfiable, then we pick the rows representing true vars. We get first n bits 1s. We get each upper column 1, 2, or 3 (it cannot be 0 because there is at least 1 true var in each clause, it cannot be more than 3 because there are 3 vars in each clause). Then we pick corresponding slack rows to get all 4s on the upper bits. The reverse direction is similar.
  • $3-SAT\le VC$
    • Construct a triplet for each clause, a pair for each variable $x$: $(x, \not x)$.
    • For a truth assignment, we pick 1 from the triplet to be true, the other 2 to be false. Add the false ones to the vertex cover. Add the true variable to the vertex cover. With-in edges are covered by true variables. Edges from pair to triplet have two cases. If the end in the triplet is selected(false), it is covered. If not(true), it is covered by the end in the pair.
    • For a n+2k vertex cover, 1 vertex in the triplet is not covered by itself but by the end in the pair, which is true. So this variable is true.
  • $SAT\le 3-SAT$
    • Use parse tree to construct triplets with intermediate results.
  • $SUBSET-SUM\le PARTITION$
    • populate $x_1, …, x_n, 2t-a$, where $a$ is the sum
  • $SUBGRAPH-ISOMORPHIC$
    • CLIQUE reduces to this
  • $HALTS$

Probability

  • RandInt(m): 1, 2, 3, … m with $p=\frac{1}{m}$
  • Bernoulli(p): 1 with $p=p$

outcome: A leaf in probability tree

Sample Space: Set of all outcomes

Event: A subset of outcomes

Law of Total Probability: $p[b] = p[a] p[b|a] + p[a^c] p[b|a^c]$

Independence of Multiple Events: for all subsets, for all size.

Random Variable: function from sample space to reals.

Indicator: R.V. x which is 1 when event A occurs, 0 otherwise.