Introduction
What follows is an extremely abbreviated look at some of the important ideas of the general areas of automata theory, computability, and formal languages. In various respects, this can be thought of as the elementary foundations of much of computer science. The area also includes a wide variety of tools, and general categories of tools ...
Symbols, strings and languages
We should start with a few definitions. The first step is to avoid defining the term 'symbol': this leaves an open slot to connect the abstract theory to the world ...
We define:
|
We let
|
Exercise: Verify each of these statements.
|
|
Pf: First, it is easy to see that
|S| ≤ |P(S)|,
since the map s ↦ {s} is a one-to-one function from S into P(S).
On the other hand, no function f : S → P(S) can be onto. To show this, we need to exhibit an element of P(S) that is not in the image of f. For any given f, such an element (which must be a subset of S) is
R_f = {s ∈ S | s ∉ f(s)}.
Suppose, to the contrary, that R_f = f(s) for some s ∈ S.
If s ∈ f(s) = R_f, then by the definition of R_f, s ∉ f(s). This is a contradiction.
If s ∉ f(s) = R_f, then by the definition of R_f, s ∈ R_f = f(s). Again, a contradiction.
Since each case leads to a contradiction, no such s can exist, and hence f is not onto. QED
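The diagonal argument can be checked exhaustively on a small finite set. The sketch below is only an illustration, not part of the proof: for S = {0, 1, 2} it builds every function f from S to P(S), forms the set of elements s with s not in f(s), and confirms that this set is never in the image of f. The names `diagonal_set` and `all_subsets` are hypothetical helpers introduced here.

```python
from itertools import product


def diagonal_set(S, f):
    """Return {s in S : s not in f(s)} for a map f (a dict) from S to subsets of S."""
    return frozenset(s for s in S if s not in f[s])


def all_subsets(S):
    """Return every subset of the finite set S as a frozenset."""
    items = sorted(S)
    return [
        frozenset(x for x, keep in zip(items, mask) if keep)
        for mask in product([False, True], repeat=len(items))
    ]


S = {0, 1, 2}
subsets = all_subsets(S)

# Try every function f : S -> P(S); there are 8**3 = 512 of them.
for images in product(subsets, repeat=len(S)):
    f = dict(zip(sorted(S), images))
    # The diagonal set is a subset of S that f never hits.
    assert diagonal_set(S, f) not in f.values()
```

A finite check like this proves nothing about infinite sets, but it makes the construction concrete: the diagonal set differs from f(s) at the element s itself, for every s.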
Exercise: Show this. (Hint: show that ℝ is the same size as (0,1) = {x ∈ ℝ | 0 < x < 1}, and then use the binary representation of real numbers to show that |P(ℕ)| = |(0,1)|.)
There are languages that cannot be recognized by any computation. In other words, there are languages for which there cannot exist any computer algorithm to determine whether an arbitrary string is in the language or not.
To see this, we will take as given that any computer algorithm can be expressed as a computer program, and hence, in particular, as a finite string of ASCII characters. Since ASCII* is countably infinite, there are at most countably many computer algorithms/programs. On the other hand, a language is an arbitrary subset of A* for some alphabet A, and A* is countably infinite whenever A is finite and nonempty, so by the power set argument above the collection P(A*) of all languages over A is uncountable. There are therefore more languages than programs, and some language is recognized by no program.
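The countability half of this argument rests on being able to list all finite strings over an alphabet one after another. A minimal sketch, assuming a finite alphabet given as a Python string and a hypothetical generator name `enumerate_strings`, lists A* by length and then alphabetically, so every string shows up at some finite position:

```python
from itertools import count, islice, product


def enumerate_strings(alphabet):
    """Yield every finite string over `alphabet`, shortest first, then alphabetically."""
    symbols = sorted(alphabet)
    for length in count(0):
        # product(..., repeat=0) yields one empty tuple, i.e. the empty string.
        for letters in product(symbols, repeat=length):
            yield "".join(letters)


# The first seven strings over the alphabet {a, b}.
first = list(islice(enumerate_strings("ab"), 7))
print(first)  # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```

The same enumeration applied to the ASCII alphabet lists every candidate program text, which is all the argument needs.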
No royal road
There is no royal road to logic, and really valuable ideas can only be had at the price of close attention. But I know that in the matter of ideas the public prefer the cheap and nasty; and in my next paper I am going to return to the easily intelligible, and not wander from it again.
- C.S. Peirce in How to Make Our Ideas Clear, 1878
Finite automata
In this context when we talk about a machine, we mean an abstract rather than a physical machine, and in general will think in terms of a computer algorithm that could be implemented in a physical machine. Our descriptions of machines will be abstract, but are intended to be sufficiently precise that an implementation could be developed.
We think in terms of feeding strings from A* into the machine.
To do this, we extend the transition function to a function
|
|
We can then define the language of the machine by
|
We can think of the machine M as a recognizer for L(M), or as a string processing function
|
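A deterministic machine of this kind is straightforward to simulate. The sketch below is one possible rendering, under stated assumptions: the transition function is stored as a dictionary keyed by (state, symbol), the start state and the set F of accepting states are passed explicitly, and the names `extended_delta` and `accepts` as well as the example machine are hypothetical. Extending the transition function to strings amounts to applying it one symbol at a time.

```python
def extended_delta(delta, state, string):
    """Apply the transition function symbol by symbol, starting from `state`."""
    for symbol in string:
        state = delta[(state, symbol)]
    return state


def accepts(delta, start, accepting, string):
    """A string is in the language if the machine ends in an accepting state."""
    return extended_delta(delta, start, string) in accepting


# Hypothetical example machine: strings over {a, b} with an even number of a's.
delta = {
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd", "a"): "even",
    ("odd", "b"): "odd",
}

print(accepts(delta, "even", {"even"}, "abab"))  # True
print(accepts(delta, "even", {"even"}, "ab"))    # False
```

Reading the empty string leaves the state unchanged, so the empty string is accepted exactly when the start state is accepting.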
We extend the transition function to δ̂ : S × A* → P(S) in much the same way:
|
|
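For a nondeterministic machine, the extended transition function tracks a set of possible states instead of a single state. A minimal sketch, assuming transitions are stored as a dictionary from (state, symbol) to a set of states, with missing entries meaning "no move"; the helper below takes a set of starting states for convenience, and all names and the example machine are hypothetical:

```python
def nfa_extended_delta(delta, states, string):
    """Return the set of states reachable from any state in `states` by reading `string`."""
    current = set(states)
    for symbol in string:
        # Follow every possible move from every state we might be in.
        current = {t for s in current for t in delta.get((s, symbol), set())}
    return current


def nfa_accepts(delta, start, accepting, string):
    """Accept if at least one reachable state is accepting."""
    return bool(nfa_extended_delta(delta, {start}, string) & set(accepting))


# Hypothetical example machine: strings over {a, b} that end in "ab".
nfa_delta = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
}

print(nfa_accepts(nfa_delta, 0, {2}, "aab"))  # True
print(nfa_accepts(nfa_delta, 0, {2}, "aba"))  # False
```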
A useful fact is that DFAs and NFAs define the same class of languages. In particular, given a language L, we have that L = L(M) for some DFA M if and only if L = L(M′) for some NFA M′.
Exercise: Prove this fact.
In doing the proof, you will notice that if L = L(M) = L(M′) for some DFA M and NFA M′, and M′ has n states, then M might need to have as many as 2^n states. In general, NFAs are relatively easy to write down, but DFAs can be directly implemented.
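One standard way to carry out the NFA-to-DFA direction is the subset construction, in which each DFA state is a set of NFA states. The sketch below is an illustration under assumptions, not necessarily the proof intended in the exercise: NFA transitions are a dictionary from (state, symbol) to a set of states, only subsets reachable from the start state are built, and the function name and example machine are hypothetical.

```python
def subset_construction(nfa_delta, start, accepting, alphabet):
    """Build a DFA whose states are frozensets of NFA states (reachable ones only)."""
    start_set = frozenset({start})
    dfa_delta = {}
    seen = {start_set}
    todo = [start_set]
    while todo:
        current = todo.pop()
        for symbol in alphabet:
            # The DFA moves to the set of all NFA states reachable on this symbol.
            target = frozenset(
                t for s in current for t in nfa_delta.get((s, symbol), set())
            )
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # A subset is accepting if it contains at least one accepting NFA state.
    dfa_accepting = {q for q in seen if q & set(accepting)}
    return dfa_delta, start_set, dfa_accepting


# Hypothetical example NFA: strings over {a, b} that end in "ab".
example_nfa = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
}

dfa_delta, dfa_start, dfa_accepting = subset_construction(example_nfa, 0, {2}, "ab")
dfa_states = {q for (q, _) in dfa_delta}
print(len(dfa_states))  # 3
```

For this particular machine only 3 of the 2^3 = 8 possible subsets are reachable; the exponential bound is a worst case, not the typical outcome.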
|
Exercise: Show that the class of languages defined by NFAs with ε-moves is the same as that defined by DFAs and NFAs.
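A useful ingredient for this exercise is the set of states reachable using ε-moves alone, often called the ε-closure. The sketch below assumes, purely for illustration, that an ε-move is stored under the key (state, None) in the transition dictionary; the function names and the example machine are hypothetical.

```python
def epsilon_closure(delta, states):
    """Return all states reachable from `states` using zero or more ε-moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, None), set()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure


def enfa_accepts(delta, start, accepting, string):
    """Simulate an NFA with ε-moves by closing under ε before and after each symbol."""
    current = epsilon_closure(delta, {start})
    for symbol in string:
        moved = {t for s in current for t in delta.get((s, symbol), set())}
        current = epsilon_closure(delta, moved)
    return bool(current & set(accepting))


# Hypothetical example machine for a*b*: state 0 loops on a, then an ε-move
# leads to state 1, which loops on b.
enfa_delta = {
    (0, "a"): {0},
    (0, None): {1},
    (1, "b"): {1},
}

print(enfa_accepts(enfa_delta, 0, {1}, "aabb"))  # True
print(enfa_accepts(enfa_delta, 0, {1}, "ba"))    # False
```

Since the simulation only ever tracks a set of ordinary states, it suggests how ε-moves can be eliminated without changing the language.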
Here is a simple example. By convention, the states in F are double circled. Labeled arrows indicate transitions. Exercise: what is the language of this machine?
|
Regular expressions and languages
Fortunately, there is a nice way to do this. The languages defined by Finite Automata are called Regular Languages (or Regular Sets of strings). These languages are described by Regular Expressions, which we define as follows. We will use lower case letters for regular expressions, and upper case letters for regular sets.