COMP 590 – String Algorithms – Fall 2015

Course Syllabus

  • Course URL:
  • CI Catalogue URL:
  • Instructor: Michael Soltys <>
  • Twitter: MichaelMSoltys
  • Course Outline: While this is an Advanced Topics in Computer Science course, we are going to present the foundations of String Algorithms from the beginning, starting with properties such as regularity, context-free languages, and rewriting systems. The course will cover the main ideas in combinatorics on words, and it will be based on lecture notes prepared by the instructor.
  • Lectures: T 6:00-8:50 in Sierra Hall 1422
  • Lecture Notes: Strings, Languages, and Computation, by Michael Soltys (updated Sep 8, 2015)
  • Grading: Five assignments, each worth 20% of the course grade. Each assignment to be done in a group of at least two, and at most three. The assignments will consist in problems to be solved, all of them in the style: “Show that the following is true…”

Announcements and class diary

  • Dec 1: Rice’s theorem, and the Post Correspondence Problem: an undecidable problem without a computation model in the background.
  • Nov 24: Decidability and Recursively Enumerable languages; we will finish the notes during the last lecture next week.
  • Nov 17: We finished the presentation of the paper SQUARE is NP-hard. We motivated NP-completeness in terms of SAT, UNSAT, and TAUT.
  • Nov 10: Continued presenting the material from the paper on unshuffling the square.
  • Nov 3: Started chapter 4 in the notes, and covered the material that links to the paper of square shuffles that we are reading. We defined Turing Machines, and discussed the encodings of instantiations of models of computation with strings.
  • Oct 27: Reviewed solutions to assignment 3; we showed how to do the SUFFIX(A) questions with a PDA. We covered the first 5 slides of the Square Shuffle result.
  • Oct 20: We spoke about the hierarchy of classes: regular, CFL, CSL, rewriting systems, as well as P, NP, (N)PSPACE, and others. We gave an introduction to NP-hardness; we defined the square shuffle problem – good stuff!
  • Oct 13: We started chapter 3 of the notes: Context-Free languages. We did basic definitions, and examined ambiguity. PDAs also recognize CFLs, just as CFGs. Can you show that the example of a CFL ({a^ib^jc^k:i=j or j=k}) is inherently ambiguous? Can you show that deterministic PDAs are weaker than regular PDAs? A paper was distributed: Unshuffling a square is NP-hard; this paper is related to PDAs and it solved an interesting open problem. We are going to discuss it for the next few lectures.
  • Oct 6: Myhill-Nerode Theorem and the indistinguishability relation. We then talked about Exercise 1.15, and reduced one direction to the proof of two minor Claims.
  • Sep 29: Class canceled due to the on campus incident.
  • Sep 22: How to show that languages are not regular? Impossibility results are difficult, as we have to show that all DFAs fail to recognize a given language. We show how to work with the Pumping Lemma, and how it is really the Pigeon Hole principle in disguise.
  • Sep 15: Regular languages can be defined with DFAs, NFAs, and Regular Expressions. We discussed problem 1.11 on the assignment.
  • Sep 8: An outline of the course: alphabets, strings/words, languages, classes of languages. We then spent time on exercise 1.6: find all solutions to (uv)^n=v^nu^n; we saw that it is important to get the quantification right – what we want to prove is the following statement: for all n, [if (uv)^n=v^nu^n, then there exists a w,k,l such that u=w^k and v=w^l]. This was after we tried different hypothesis for a solution description of the equation. The Basis Case, n=1, is itself shown by induction on |u|+|v|. I find this to be a very interesting problem, that requires clear and precise thinking – a good start to the course.
  • Sep 1: The MU-Puzzle, and we worked on exercise 1.5.
  • Aug 25: First meeting; we covered the first 6 pages of the notes.


  • Assignment 1: due on September 22: submit exercises 1.3-8, 1.10-12
  • Assignment 2: due on October 6 (extended to Oct 13): submit exercises 1.13,14,15
  • Assignment 3: due on October 23: submit exercise 2.13, as well as a solution to the following problem: Let SUFFIX(A)={v|uv in A for some string u}. Show that if A is a CFL, then so is SUFFIX(A).
  • Assignment 4: due on November 10: Recall the operators NOPREFIX and NOEXTEND, presented in class: given a language A, NOPREFIX(A) is the set of w’s such that no proper prefix of w is in A; on the other hand, NOEXTEND(A) is the set of w’s such that w is not the proper prefix of any string in A. Show that if A is regular, so are NOPREFIX(A) and NOEXTEND(A). Also show, that the same does not hold for CFLs.
  • Assignment 5: due on November 24: Consider QAs, which are PDAs equipped with a queue rather than a stack. How does this model of computation correspond to Turing Machines and PDAs?