April 2006 Archives

2006-04-29 22:29:05

Richard Bellman

When I searched some informations about dynamic programming, I found that the inventor of dynamic programming is Richard Bellman. This technique is one of most famous techniques of designing algorithm. Nevertheless, I thought not many CS students know that the power of dynamic programming.

After googling, I found that Richard Bellman wrote some books about DP, but one of them make me surprise, i.e., "Dynamic programming and partial differential equations". I didn't know that Dynamic programming could deal with the problem of partial differential equations. It's really interesting. My innocence make me underestimate it.

Posted by AbnerCYH | Permanent Link

2006-04-29 15:51:16

Systematical Algorithms Design

This Thursday, I reported chapter 5 of ``Complexity and Approximation: Combinatorial Optimization Problems and Their Approximability Properties'', by G. Ausiello, P. Crescenzi, V. Kann, Marchetti-sp, Giorgio Gambosi, Alberto M. Spaccamela

This chapter is about randomized algorithms, and the most useful method is linear programming with random rounding.

I am convinced that the inventors of general approaches are greater than the inventors of special great algorithms. Linear programming (or Convex Programming) is the most systematical approach to design algorithms for combinatorial optimization. Algorithms for LP such as simplex algorithm, interior point methods, ellipsoid method, and projective method are not so efficient as the customized algorithms usually, however, in the NPC problems, the approach that formulate problem into convex program and then solve it by above algorithms becomes special powerful method because of lack of polynomial time algorithms.

In the other hand, for problems in P, linear programs are not so efficient, however, some researchers designed efficient algorithms based on LP formulization, like Martin Farber's paper about WMIDS on chordal graph.

I am studying design algorithm, however, I can't possess the knack of design algorithm. My thinking approach is too naive to handle complex problems. I thought this is bottleneck.

Posted by AbnerCYH | Permanent Link

2006-04-27 22:34:06

Last Week

Last week I were busy for preparing mid-term exam, This Monday was mid-term exam of Compiler Construction, and this Tuesday was homework deadline of Randomized Algorithm, and today I reported Michel X. Goemans's paper about semidefinite programming at Algorithm Seminar II.

The busiest days passed, however it is not the time to take rest. May 22, 2006 is the due date of the third checkpoint of Compiler Construction, and I need to finish the second checkpoint. Actually, it is not too busy for me, however, I will hard to catch up the progress of statistical physical and quantum computation.

Posted by AbnerCYH | Permanent Link

2006-04-21 17:33:28

Nondeterministic

I don't like this term too much, and I don't think this term is lucid enough as I mentioned before. In Sipser's book, he referred to parallel universes for explaining the underlying concept of this term. It's interesting but hard to accept. Sometimes it will cause the confusion between NP and co-NP. However, I didn't know who introduce this term.

Luckily, I found it as I surfed on the INTERNET few days ago.

Michael Oser Rabin (born 1931 in Breslau, Germany, today in Poland) is a noted computer scientist and a recipient of the Turing Award, the most prestigious award in the field.

The citation for the Turing Award, awarded in 1976 jointly to Rabin and Dana Scott for a paper written in 1959, states that the award was granted:

For their joint paper "Finite Automata and Their Decision Problem," which introduced the idea of nondeterministic machines, which has proved to be an enormously valuable concept. Their (Scott & Rabin) classic paper has been a continuous source of inspiration for subsequent work in this field.

Ref. Michael O. Rabin - Wikipedia, the free encyclopedia

Posted by AbnerCYH | Permanent Link

2006-04-16 09:25:23

Nothing Wrong

Ref. : Learned Helplessness

It is nothing wrong that one feels negative when fails in competition. Sometimes, It is the origin of growing and advancing.

Posted by AbnerCYH | Permanent Link

2006-04-15 06:54:00

Back to Foundations

When I first time visited my adviser prof. Tang, I remember I told him that I want to learn something interesting and foundations, which will not faded as new technology appeared. I thought it called TCS.

However, I think I didn't follow I words, I am too thin to do research. I should come back to the foundations.

The degeneration which I mentioned before, is not only because of lack exercise, but also bad study habits.

Here is my todo list.

Study permutation graph
Symbolic Logic and Mechanical Theory
Noting Suffix Trees
Algebraic Statistics for computation
Computability, complexity, and language
The Analysis of Algorithms
Quantum Computation
Statistical Physics (I am interested in it, because I want to study complex science. It might not necessary to finish a book of Statistical Physics.)

Is it too much? Maybe. Maybe not. However, I can't to learn it deeply, I just can browse it, I will say I understand it, but how I can say that without doing exercise?

Posted by AbnerCYH | Permanent Link

2006-04-15 04:01:36

Learned Helplessness

I had written some posts about competitions on my the other blog, which is in chinese for personal feelings and life. I thought that it is good that heterology of graduated students makes us learn to appreciate each other.

And I had written that I really felt helplessness in competitions. In my one of those posts, I quoted Joel Spolsky's words in his article "User Interface Design for Programmers -- Controlling Your Environment Makes You Happy",

Years later, when I got to college, I learned about an important theory of psychology called Learned Helplessness, developed by Dr. Martin E. P. Seligman. This theory, backed up by years of research, is that a great deal of depression grows out of a feeling of helplessness: the feeling that you cannot control your environment.

I were a sentimental kid, but I had a happy childhood overall. As I grew up, I became more and more unhappy. Sometimes I will escape competing. Eventually, I learned that the best way to prevent to hurt my ego is to do my best --- to avoid to involve in competing.

For example, I was love to do IQ games like Martin Gardner's "Mathematical Games" column in Scientific American, however, I nevermore do it now. As in the NSC project, I rarely felt happy. I have too long time without positive feedback.

However, I find that I can't dodge it anymore. The sword will sharper by sharpening. I discover the explicit degeneration of my intellect. Although I can't control my environment to makes me happy, I can lose with esteem

Posted by AbnerCYH | Permanent Link

2006-04-14 12:53:44

How to prove worst case?

Sometime we need to analysis time complexity of a problem in worst cases. However, how to prove that this case will be the worst case? I.e., prove no other cases' execute time more than it.

Posted by AbnerCYH | Permanent Link

2006-04-14 09:23:15

Note for compiler project :: Parser

Date: Wed Apr 12 23:24:31 2006 Finish C's grammers. The output of bison is

c.grammer.y contains 17 useless nonterminals and 54 useless rules

It might because C is too complex, it might be ok to ignore some grammers, since instructer didn't request to implement it completely.

c.grammer.y contains 6 shift/reduce conflicts and 44 reduce/reduce conflicts.

This part might be more difficult. Maybe I need to modified it by hand.

Date: Thu Apr 13 01:20:54 2006 Type "bison -v -t -d inputfile" on command line to produce imputfile.output for debuging!

I fixed some typo, and change to set %start as translation_unit. I don't know what it means? Doc. doesn't describe it clearly, however it seems for preprocessor, i.e., C parsers need to parse the whole program as "translation_unit" and then derive it to some external declaration, such as functions, macros, etc.

I might skip the preprocessor part of C.

Now, I improve the code with

c.grammer.y contains 10 shift/reduce conflicts and 54 reduce/reduce conflicts.

I don't know why? It might be caused by grammers, because some grammers might produce some cose as

typedef int shit;
shit shit var;

So it might fixed by coding other procedures, or modify grammers.

Date: Thu Apr 13 07:43:47 2006 The probelms mentioned earlier might be caused by that Bison only can eat LALR(1) grammers.

Date: Thu Apr 14 09:43:47 2006 I fiexd the code last night. those problems is cause by that Bison only can eat LALR(1) grammers.

I use the method "add dummy token" to solve those problems.

Now I needs to merge the codes of scanner into parser. I had a liitle bit confusions about the structure of flex and bison. However, the coding plan is that

rewirte the c.lex, because in the orignal code, I make it return its lexical classes, e.g., KEYWORD, INT. However, for merging it, it should be modified to returning its value, e.g., ID, int, ++, -=. The code of parser also needs to be modified. Jeff Lee's version used tokens instead of symbols such as INC for "++", but I use the symbols directly. It might cause some problems for passing value from flex's function to bison's function.

Posted by AbnerCYH | Permanent Link

2006-04-14 06:16:25

Ajax/Graphviz

I heard Graphviz after learning LaTeX a while. It seems great but not so easy to learn. I try to use the drawing packages of LaTeX to draw graphs in PDF file, but it costs many (file) size then PNG format graph.

I got informations about Ajax/Graphviz from BBS. It is great, at least it makes Graphviz seem not so hard to learn.

node ;
a [label="Aabner C.Y. Huang"];
b [label=" Works |  Interest"][shape=record];
c [label="NSC Project"];
d [label="WIDS Problem"];
e [label="MMIS Problem"];
f [label="Combinatorics"];
g [label="Complexity"];
h [label="Algorithms Design"];
i [label="Computational"];
j [label="Complex Science"];
k [label="Descriptive"];
l [label="Approximation"];
m [label="Symbolic Dynamics"];
a->b;
b:f0->c;
c->d;
c->e;
b:f1->f;
b:f1->g;
b:f1->h;
g->i;
g->j;
g->k;
h->d;
h->e;
h->l;
i->l;
f->m;
j->m;

Posted by AbnerCYH | Permanent Link

2006-04-12 06:47:39

Term Rewriting

Rewriting - Wikipedia, the free encyclopedia
In mathematics, computer science and logic, rewriting covers a wide range of potentially non-deterministic methods of replacing subterms of a formula with other terms.
Graph rewriting - Wikipedia, the free encyclopedia
In graph theory, graph rewriting is a system of rewriting for graphs. During the application of graph rewriting to a graph, subgraphs are replaced according to the rules of a rewrite system. Sometimes graph grammar is used as a synonym for graph rewriting system, especially in context of formal languages.
The RTA list of open problems
IFIP/RTA :: Rewriting Home Page
Equational reasoning is an important component in symbolic algebra, automated deduction, high-level programming languages, program verification, and artificial intelligence. Reasoning with equations involves deriving consequences of given equations and finding values for variables that satisfy a given equation. Rewriting is a very powerful method for dealing with equations.
This site collected many resources including surveys.
Information about the course Automated Reasoning (2R880), Dr. H. Zantema
Methods and Formal Models in Computer Science, Nachum Dershowitz
Seminar on Rewriting and Equational Reasoning, Nachum Dershowitz
Automated Deduction for Equational Logic 2003, Uwe Waldmann
REWRITE SYSTEMS, Spring 2002, N. Dershowitz
REWRITE SYSTEMS MINI-COURSE, Ben-Gurion University, April 1997, N. Dershowitz
CPSC 415: Special Topics in Computer Science: Logic and Computation, Fall 1998, Dr. Chuck C. Liang

About [2], I knew this because of clique-width problem, and then knew something called Handle-rewriting hypergraph grammars, or Node rewriting in graphs, therefore I knew that graph can be modeled by formal language. Three or four weeks ago, from LEAST seminar, I knew that we can use (stochastic) context-free grammars to modeling tRNA. Those things made me feel the term-rewriting important.

About [4], it contained some surveys. I am interested in the following two surveys.

D. Plump, Term Graph Rewriting, In Handbook of Graph Grammars and Computing by Graph Transformation, Volume 2: Applications, Languages and Tools, Chapter 1, pages 3-61, eds. H. Ehrig, G. Engels, H.-J. Kreowski and G. Rozenberg. World Scientific, 1999
Nachum Dershowitz and David A. Plaisted, 2001, ``Rewriting'', chap. 9 in: Handbook of Automated Reasoning, vol. 1, A. Robinson and A. Voronkov, eds., Elsevier, pp. 535-610. Reviewed in Zbl 0992.68123.

Posted by AbnerCYH | Permanent Link | Categories: Notes

2006-04-11 08:36:13

CSI 2006 Call For Papers

CSI 2006 Call For Papers, Special Session on Graph Theory and Applications in The 9th International Conference on Computer Science and Informatics, in conjunction with The 9th Joint Conference on Information Sciences. October 8-11, 2006 Kaohsiung City, Taiwan, ROC

THEME AND SCOPE:

With the successes in series of CSI conferences and due to the extensive growing areas of Graph Theory, the CSI opened the special session for Graphs since 2003. The CSI 2006 continues to open a special session for Graph Theory and Applications. The session is intended to provide an international forum bringing together researchers, academics, and industrial technologists in all areas of graph theory and applications. Papers describing how graph theory can be applied to various areas in Computer Science, or by extracting and solving new problems from applications are welcome. The goal is to present newest research results and to identify and explore issues and directions of future research on Graphs and Their Applications.

Papers are solicited describing original results on all areas of Graphs, e.g., structural graph theory, sequential, parallel, and distributed graph and network algorithms and their complexities, graph grammar and graph rewriting systems, graph-drawing and layout, graph-based and digraph-based models, on-line, randomized, and approximation graph and network algorithms. IMPORTANT DATES:

Paper due: April 15, 2006 Authors notification: June 10 , 2006 Camera-ready paper due: July 15, 2006

Session Chair:

Professor Chung-Kung Yen Department of Information Management Shih Hsin University Taipei 116, Taiwan, ROC Tel: +886-2-22368225 Fax: +886-2-22367114

Posted by AbnerCYH | Permanent Link

2006-04-11 07:32:42

Job Market Signaling

Title: Job Market Signaling
Author(s): Michael Spence
Source: The Quarterly Journal of Economics, Vol. 87, No. 3. (Aug., 1973), pp. 355-374.
Stable URL: http://links.jstor.org/sici?sici=0033-5533%28197308%2987%3A3%3C355%3AJMS%3E2.0.CO%3B2-3
Abstract: I. Introduction 355.--2. Hiring as investment under uncertainty, 356.--3 Applicant signaling, 358.--4. Informational feedback and the definition of equilibrium, 359.--5. Properties of informational equilibria: An example, 361.--6. The informational impact of indices, 368.--Conclusions, 374.

In a news group post which is titled "signaling...", a guy mentioned this classic paper. I am not economics major, however I am very interested in this paper, because it is about "low-ability persons will not pursue the high educational background because they can't do it well."

Sad!

It is what I thought. Why do I stay here?

Posted by AbnerCYH | Permanent Link

2006-04-10 03:22:32

Reviewing Papers - Conflict of Interest

An Anonymous commentator gave an URL on prof. Fortnow's new post "Reviewer Ethics" for guidelines on conflict of interest.

Your Ph.D. advisor and Ph.D. students forever.
Family relations by blood or marriage forever (if they might be potential reviewers).
People with whom you collaborated in the last five years.
Collaborators include co-authors on an accepted/rejected/pending research paper, co-PIs on an accepted/rejected/pending grant, those who fund of your research, and researchers who you fund. You many exclude "service" collaborations like writing CSTB report or serving on a program committee together.
People who shared your primary institution in the last five years.
Others with whom you believe a conflict of interest exists.

Prof. Fortnow's opinion is good, but I think it might be not easy to obey, especially the part of making a significant extension.

Posted by AbnerCYH | Permanent Link

2006-04-09 23:53:12

Recent discussions

Recently, there were some questions about computational complexity from discussions. Although I don't attend the course of theory of computation, in order to do research I have learned a little of it by myself. Because I read it by myself, I had many questions. For example, I am confused by the definition of NFA or NDTM; by it definition, the transition function will map states to power set of states. This part is okay. But it is informal description of iterations of transition function, i.e., the NFA expands or to several NFAs for each state which is gotten from the last iteration's output, set of states. In other words, it is a kind of cleavage of parallel universes for each iteration of NFA. (This allegory is from Michael Sipser's book.)

It might be a formal description of iterations of transition function, but I really don't like it. I know the model of finite state machine is more comfortable to scientists/engineers than other models of computation like lambda calculus. However, the concept of non-deterministic machine is too oracular. Why don't we adopt another model? (Another question is why Church is not famous as Turing?)

In fact, Sanjeev Arora said that the computational model isn't matter, in his draft of upcoming textbook of computational complexity. Actually, it is the topic of chapter one of his book, i.e., "The computational model -- and why it doesn't matter." However, I don't know why it doesn't matter after reading chapter one. I think it might be my fault to interpret this sentence in wrong way. Sanjeev Arora seems to say that computational model isn't matter, because all models will be equivalent to UTM.

Therefore, I check out Martin Davis' book Computability, complexity, and languages : fundamentals of theoretical computer science, for learning some different computational model.

Besides I had few progress of other subjects, such as statistical physical, quantum computation, combinatorial optimization, randomized algorithms, compiler construction, symbolic dynamics, and so on. I deferred the study plan of cellular automata, because Cellular automata (CA)-type models seems to have less analytic results. And according to Jarkko Kari's survey paper, the CA research could be divided into three parts, computation, topological dynamics, and simulation. I am not interested in last one, and the others could be referred to other fields which I am interested in.

I prefer self-study than attending courses, thus I never fellowed instruction of lecturer. For example, the last two days, I spent my time on reading paper of approximation algorithms and Automated Reasoning, Knowledge Representation, and Automated theorem proving. :-(

Posted by AbnerCYH | Permanent Link

2006-04-09 11:22:45

Knowledge Representation

When I searched on the internet for Automated Reasoning, I found some applications about verification, logical programming, and so on. Most of them are not so umfamiliar to me, but it is hard to image what is knowledge representation.

What is a Knowledge Representation?
Knowledge representation - Wikipedia, the free encyclopedia. According to this doc., to date, this field seems not so close to automated reasoning. Recent study seems to intend to computational linguistics.
Representation AI Topics provides basic, understandable information and helpful resources concerning artificial intelligence, with an emphasis on material available online.
2.2 Knowledge Representation
Knowledge Representation
AI Education Repository
The Classic Knowledge Representation System
Databases and Artificial Intelligence 3 Artificial Intelligence Segment by Alison Cawsey, Knowledge Representation and Inference
John F. Sowa's Site Directory
CS4725 ---- Knowledge Representation, Columbia University, Professor Leora Morgenstern.

Posted by AbnerCYH | Permanent Link | Categories: Notes

2006-04-09 11:06:16

Automated Reasoning

After the last LEAST Seminar, I had some interests on automated theorem proving so that I collected something here.

Lecture "Automated Reasoning", SS 2006 (Max-Planck-Institut f��r Informatik). The link `Literature' contains several docs.
University of Manchester, CS612: Automated Reasoning, Renate Schmidt
UCLA, CS264a: Automated Reasoning, Fall 2004, Professor Adnan Darwiche
King's College London, CS3AUR: Automated Reasoning 2002, Ulle Endriss
Mechanized Reasoning.
Stanford Encyclopedia of Philosophy, Automated Reasoning. I am specially interested in section 14. Deductive Computer Algebra.
To prove automatically even the simplest mathematical facts requires a significant amount of domain knowledge. As a rule, automated theorem provers lack such rich knowledge and attempt to construct proofs from first principles by the application of elementary deduction rules. This approach results in very lengthy proofs (assuming a proof is found) with each step being justified at a most basic logical level. Larger inference steps and a significant improvement in mathematical reasoning capability can be obtained, however, by having a theorem prover interact with a computer algebra system, also known as a symbolic computation system.
ORA Canada Bibliography of Automated Deduction
Automated Reasoning on CologNet which is the Network of Computational Logic. It seems a portal website including infomations about research, tools, etc. This web listed some applications of AR.
There are too many applications to list them completely. Automated reasoning is used in software and hardware verification, logic and functional programming, formal methods, knowledge representation, planning, constraint satisfaction and optimization, deductive databases, as well as many other areas of artificial intelligence.
PRL Project :: Automated Deduction
From Stephen Wolfram: A New Kind of Science, there are couple of pages about "Automated theorem proving".

I read the first handout of prof. Pfenning's course, and chpater 1-4 of prof. Lee's book, Symbolic Logic and Mechanical Theorem Proving. The two professors used different logic systems. Prof. Pfenning used intuitionistic logic and prof. Lee adopted first-order logic. Prof. Lee's book has very good reputations, although it might be a little out of date. However, after some works of searching, I know that many lecturers prefer first-order logic, although there are so many other logic systems, such as classical logic, intuitionistic logic, modal logic, relevance logic, higher-order logic, dynamic logic, temporal logic, linear logic, belief logic, and lax logic.

Chalmers University of Technology. Automated Theorem Proving, course given by Reiner H��hnle, April 6, 2005.
CSE 291: Applied Automated Theorem Proving Winter Quarter, 2006, by Sorin Lerner.
Automated Theorem Proving, by Arnon Avron & Mooly Sagiv, Spring Semester, 2006, Tel-Aviv University.
Automated Theorem Proving On-Line Course Materials Resource, maintained by David A. Plaisted.

Posted by AbnerCYH | Permanent Link | Categories: Notes

2006-04-08 00:30:44

Suffix Tree

Today's Computational Biology talked about KMP, BM algorithms, and suffix tree.

I am most interested in suffix tree. I had thought that what is professional of a algorithmist? A algorithmist might be a academic researcher or a RD engineer. Here I put more emphasis on industry researcher. The concept of suffix tree is good and useful, but it's very complicated and hard to analyze. The first algorithm to build suffix trees in O(n) was discovered by P. Weiner, and then E. M. McCreight proposed another more readable and space economical algorithm. According to E. M. McCreight's paper

A. lucid description of Weiner's algorithm, with additional insights, appears in unpublished lecture notes by Knuth [4]. The state of the art of pattern matching, including Weiner's algorithm, is well presented in a book by Aho, Hopcroft, and Ullman [1].

Much later, Esko Ukkonen gave a on-line algorithm is presented for constructing the suffix tree. This algorithm is complex, and doesn't seems to be in O(n), but Esko Ukkonen use a lot of techniques to reduce its complexity.

Personally, I don't really like algorithms of this style. I like simple algorithms with seminal idea just like beauty criterion of mathematician. However, I think Conan's algorithm representing the professional spirit, the efforts to reducing time complexity, of algorithmists or algorithm engineers.

Lloyd Allison :: Algorithms and Data Structures Research & Reference Material :: Suffix Tree
Chris Lewis, Suffix Trees in Computational Biology
E. Ukkonen. On-line construction of suffix trees. Algorithmica, 14:249-60, 1995. [Citeseer]
McGill University, School of Computer Science, Winter 1997 Class Notes for 308-251, DATA STRUCTURES AND ALGORITHMS, Topic #7: TRIES AND SUFFIX TREES
Jewels of Stringology, by Maxime Crochemore, Wojciech Rytter
Yuanyuan Tian and Sandeep Tata and A. Hankins and M. Patel, Practical methods for constructing suffix trees, The VLDB Journal, volume 14, number 3, 2005.
Practical Suffix Tree Construction, Sandeep Tata, Richard A. Hankins, Jignesh M. Patel, 2004

If there is a suffix in a string to be prefixes of other suffixes, then it will correspond to an implicit suffix tree which can end at an interior node. This tree lose the advantage of searching efficient. To improve it, we can append a "$" to string to build the explicit suffix tree, each suffix is explicit -- the suffix ends at a leaf node.

Ukkonen's Algorithm : E. Ukkonen. On-line construction of suffix-trees. Algorithmica, 14(3):249�V260, 1995. On-line algorithm of linear time and less space

I_i: the implicit suffix tree of the string S[1,i], i.e., Ukkonen's Algorithm take prefixes of S as inputs, and match suffixes of input for building suffix tree.

1. Construct I_1;
2. for i = 1 to m-1 do /* Phase i + 1 */
     for j = 1 to i + 1 do /* Extension j */
       /* Construct I_{i+1} from I_i */
       Match each suffix of I_{i+1} on suffix tree of I_i, 
       and extend that suffix into I_i by suffix extension
       rules;
     end for
   end for
3. Convert I_m into a true suffix tree of S$;

Suffix Extension Rules: S[j, i] �� S[j, i+1]

Rule 1: If a suffix of S[j,i+1] ends at a leaf of suffix tree of S[j,i], then attend the end-symbol of suffix to that leaf.
Rule 2: If a suffix of S[j,i+1] does not end at a leaf and the continued character x =/= S[i + 1] (add a new branch on it)
Rule 3: If some suffixes of S[j,i+1] are already in tree, then do nothing since S[j,i+1] is in I_i.

Because the substring is stored on the edge, so that "ends at a leaf" means that a suffix is matched the string on the path by from the root to a leaf except the end-symbol of suffix. Therefore, "attend the end-symbol of suffix to that leaf" actually means attend the end-symbol of suffix to the substring which is stored on the last edge of the path.

The key issue in implementing Ukknonen's algorithm is how to locate the ends of all the i+1 suffixes of S[1,i].

Naive: constructing I_{i+1} from I_i costs O(i^2). �� after constructing I_m, total cost is O(m^3).

Thus, Ukknonen used a lot of techniques to reduce the time complexity.

Suffix Link: The shortcuts between internal nodes.
Edge-Label Compression: Use indices of substring instead of characters.
In any phase, if suffix extension rule 3 applies in extension j, it will also apply in all extensions k, where k > j, until the end of the phase.
Once a Leaf, Always a Leaf

In phase i + 1, two extensions are considered:

Implicit extensions: Consecutive extensions, from first suffix of S[1,j+1] to the k-th extension, just apply rule 1.
Explicit extensions: The k+1-th extension from to the extension where rule 3 applies.

We can set the 2ed index on the edge as a pointer that point to a variable for updating all implicit extensions in each phase at once.

The cost of each explicit extension is a constant (possible up-walk and suffix-link traversal) plus some time proportional to the number of node skips it does during the down-walk in that extension. Since no node has depth >m and < 2m explicit extensions are executed, the total number of node skips done during all the down-walks is bound by 5m = O(m).

Time-complexity of Ukkonen��s algorithm: O(m)

P.S. According to C.L. Lu and C.Y. Tang's lecture notes of computational biology.

Posted by AbnerCYH | Permanent Link | Categories: Notes

2006-04-07 23:59:19

Note on LEAST Meeting 20060407

Ref. : A 1.375-Approximation Algorithm for Sorting by Transpositions, Isaac Elias and Tzvika Hartman, Proceeding of the 5th International Workshop on Algorithms in Bioinformatics (WABI2005).

Sorting permutations by transpositions is an important problem in genome rearrangements. A transposition is a rearrangement operation in which a segment is cut out of the permutation and pasted in a different location. The complexity of this problem is still open and it has been a ten-year-old open problem to improve the best known 1.5-approximation algorithm. In this paper we provide a 1.375-approximation algorithm for sorting by transpositions. The algorithm is based on a new upper bound on the diameter of 3-permutations. In addition, we present some new results regarding the transposition diameter: We improve the lower bound for the transposition diameter of the symmetric group, and determine the exact transposition diameter of 2-permutations and simple permutations.

The authors use the breakpoint graph as their model, but I totally lose at the seminar. :-(

However, it's funny that they use the method computer-assisted proof to exhaustively enumerate all cases. It is the first time I saw it on a paper, although I heard it long time ago.

Prof. Lee doubted the correction of their method. He argued that the algorithm didn't be proven to guarantee to generate all cases, so that it can't be thought as a proof. However, this paper was accepted by WABI2005 indeed. It means that this approach is acceptable in this field.

Computer-assisted proof - Wikipedia, the free encyclopedia
Automated reasoning - Wikipedia, the free encyclopedia
Automated theorem proving - Wikipedia, the free encyclopedia
Keith Devlin, January 2005, Last doubts removed about the proof of the Four Color Theorem
Martin Davis' short article in the HANDBOOK OF AUTOMATED REASONING: The Early History of Automated Deduction
15-815 Automated Theorem Proving, Fall 99, Frank Pfenning

Posted by AbnerCYH | Permanent Link | Categories: Notes, Seminar

2006-04-06 04:45:20

America's Best Graduate Schools 2007

Ref. : usnews.com: America's Best Graduate Schools 2007: The Sciences: Computer Science

Via Ernie's 3D Pancakes, full theory list is following.

   1. MIT
   2. Stanford
   3. Berkeley
   4. Cornell
   5. Princeton
   6. CMU
   7. Harvard
   8. Columbia
   9. Georgia Tech
  10. Washington
  11. Texas
  12. Chicago
  13. UIUC
  14. UCSD
  15. Yale
  16. Brown
  17. Wisconsin
  18. Caltech
  19. Duke, Penn (tie)
  20. Rutgers
  21. Maryland, Michgan (tie)

My library's academic database doesn't update yet so that I can't get the full-text of this report. However, the ranking of grad. school should be more delicate. I think it will be more useful. The comment of Janos Simon is great.

In short: grad school can be great fun. It is the time you really have very little else to do besides becoming a good scientist. Go all out to make the best of it!

PS. I find that Janos Simon is adviser of Shi-Chun Tsai who is instructor of course Randomized Algorithms I attend.

Posted by AbnerCYH | Permanent Link

2006-04-05 23:36:28

Algebraic Statistics

Introduction to Algebraic Statistics for Computational Biology (Powerpoint), April 2005. Lior Pachter
Combinatorial aspects of gene finding and alignment (Powerpoint), A series of two talks for the workshop on "Combinatorics and Combinatorial Aspects of Biology", New Plymouth, New Zealand, January 2003. Lior Pachter
Homepage of book Algebraic Statistics for Computational Biology. And authors' collections of Courses and seminars based on the book
Tutorial on Algebraic Statistics (Part III) Kimberly Sellers.
Evolution on partially ordered sets. Nicholas Eriksson, University of California, Berkeley, Slides
John von Neumann Lectures 2003 :: Algebra & Geometry of Statistical Models, by Bernd Sturmfels
Math 275: Topics in Applied Mathematics :: Algebraic Statistics for Computational Biology

Posted by AbnerCYH | Permanent Link | Categories: Notes

2006-04-05 21:41:21

Hypercomputation

It is a new word for me.

Hypercomputation - Wikipedia, the free encyclopedia
Hypercomputation Research Network
Hypercomputation: hype or computation?
Theoretical Computer Science, Volume 317, Issues 1-3, 4 June 2004, special issue for Super-Recursive Algorithms and Hypercomputation
Hypercomputing the Mandelbrot Set?

Update 2006-04-07:

From [1],

Today, algorithmic information theory helps us understand better what is required for hypercomputation. The hallmark of a hypercomputer is that it can solve the general halting problem, a feat impossible for ordinary computers. The hallmark of a hypercomputer is that it can solve the general halting problem, a feat impossible for ordinary computers.

(It's my first time heard the utility of algorithmic information theory.)

At this stage, none of these devices seem physically plausible. Thus, hypercomputers are likely to remain as mathematical models.

From [2],

Hypercomputation concerns the study of computation beyond that defined by the Turing machine, and is also known as super-Turing, non-standard or non-recursive computation.

In those introductory text, I am confused that they said Turing had proven the impossibility of hypercomputers, what the researchers pursue? The questions from [3] seemed to be all people's questions. i.e., measurement of infinite precision seems impossible. It will be affected by noisy and uncertainty principle.

The papers of [4], I do not read it yet. However, in its content, there are many interesting articles, such as "Hypercomputation: philosophical issues", "Natural computation and non-Turing models of computation", "Algorithmic complexity of recursive and inductive algorithms". I just need more spare time.

In [5], the author discuss the Mandelbrot set, i.e., $M = \{c \in \mathscr{C} | \forall n \geq 1, \|f^n_c(0)\| \leq 2\}$, where $f^c(x) = x^2 + c$, and $f^n$ denotes $n$-th iteration of $f$.

This shading scheme emphasises that the usual routines for generating Mandelbrot plots, like this one, clearly can identify only elements of the complement of M and not of the set itself. Scientists are well-aware this fact (that the complement of M is only known to be recursively enumerable) but it is usually not overemphasised in the more popular writing.

Without loss of generality, we shall assume a ZM to be identical to a Turing machine with one input tape, one output tape and a storage tape except that the ZM takes 1/2 hour to execute the first transition, 1/4 hour for the second, 1/8 hour for the third etc. After one hour the ZM will have finished its operation and one will perhaps find the answer to some tantalising question on the output tape.

Here ZM stands for Zeno Machine. The author showed that ZM is most powerful among other hypercomputation models. However, those idea seems too mathematical, I think that the ZM is impossible to realize. It seems the mathematical computability(constructive mathematics?) study.

Actually, I don't know what I say now. I just think that hypercomputation seems to become pure mathematical study, since the hypercomputers is no possibility to be created.

Posted by AbnerCYH | Permanent Link | Categories: Notes

2006-04-05 17:17:13

Theory or Systems

Lance Fortnow posted a new entry "How Many Students?" on his blog. This is an interesting question for me, because my adviser have almost 10 master program students and 30 Ph.D. program students.

However, this post appeals comments about totally different question, i.e., What kind of research is harder? Theory or Systems? I don't know how Lance Fortnow feels about it. In spite of digression, there are many interesting thoughts, or at least there are many interesting biases.

According to the definition (or explanation) on ACM SIGACT's homepage, the TCS means

The field of theoretical computer science is interpreted broadly so as to include algorithms, data structures, complexity theory, distributed computation, parallel computation, VLSI, machine learning, computational biology, computational geometry, information theory, cryptography, quantum computation, computational number theory and algebra, program semantics and verification, automata theory, and the study of randomness. Work in this field is often distinguished by its emphasis on mathematical technique and rigor.

Although SIGACT try to exhaust the contents of TCS, there are lots of problems in it. For example, there are time constraints in computational geometry or VLSI; O(n^2) might be the upper bound of acceptable time complexity. Therefore many researchers tend to use heuristic algorithms to improve performance. That would require more experimental study than theoretical study.

2006-04-06 Update : I like the comment from Claire Kenyon.

So, it's much easier for us theorists to spend some fraction of our effort on "hard" problems, that is, problems on which we are almost clueless. On the other hand, at the end of the day, even in Theory, very few papers are truly surprising ...

Posted by AbnerCYH | Permanent Link

2006-04-05 03:07:33

Current works

This week is a vacation. Many issues are deferred. The only job might be our NSC project. After getting some results, we want to study the others algorithmic properties of chordal graph, such as approximation. I am reading a paper, "A Note on the Approximation of a Minimum-Weight Maximal Independent Set," written by Marc Demange.

This paper leads me to contact more details of computational complexity. Actually, our group have started a reading group named "Special Topics in Algorithms II". We shared chapters of the book, "Complexity and Approximation: Combinatorial Optimization Problems and Their Approximability Properties," by Ausiello G, Crescenzi P, Kann V, Marchetti-Sp , Gambosi G, Spaccamela AM

However, it is too slow.... :-(

In fact, although this paper is not so difficult to understand besides inferring by "obviously", I searched some data about the advanced topics on it especially PCP.

Inapproximability of Combinatorial Optimization Problems, Trevisan L
Approximating the minimum maximal independence number, Halldorsson MM
Pseudorandomness and Combinatorial Constructions, Trevisan L
Zero knowledge and the chromatic number, Feige U, Kilian J
Gowers Uniformity, Influence of Variables, and PCPs, Samorodnitsky A, Trevisan L
A New Parallel Algorithm for the Maximal Independent Set Problem, Goldberg M, Spencer T
On approximating the minimum independent dominating set, Irving RW
Probabilistic Proof Systems (A Survey), Oded Goldreich
Probabilistic checking of proofs and the hardness of approximation problems. Sanjeev Arora. Ph.D. Thesis, UC Berkeley, 1994.
Proof Verification and Hardness of Approximation Problems. Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan and Mario Szegedy. Journal of ACM, 45(3):501-555, 1998.
Probabilistic Checking of Proofs: A New Characterization of NP. Sanjeev Arora and Muli Safra. Journal of ACM, 45(1):70--122, 1998.

And a lot of other concepts of computational complexity, such as Propositional Proof Systems, Zero-Knowledge, Interactive Proof Systems, circuit complexity, etc.

I think that I should start to read books of computational complexity, but it is another problem.

I am too curious and too impatient, so that I never finish a book. Now, there are many books of different fields on my desktop like topology, quantum computation, symbolic dynamics, hypergraph, statistical physics, cellular automata, combinatorial optimization, linear programming, algebraic statistics, approximation algorithms....Orz!

^^|||....

Posted by AbnerCYH | Permanent Link

2006-04-03 23:40:34

Parallel Algorithms

I am interested in many different fields, such as complex science, algorithm design, combinatorics, computational complexity, economics, etc. However, I lack the knowledge of parallel algorithms. I had read some papers of parallel algorithm, but I don't know the underlying mechanism.

To parallel algorithms, I have a question. Does it really meaningful to use a parallel algorithm? The parallel computing can't solve the NPC problems. Does the required conditions of parallel algorithms practical? We use parallel algorithm for solving the large size problem, but parallel algorithm will require polynomial numbers of processors. E.g., there is a instance with n=1000 and a parallel algorithm for this problem by O(n^2) processors. It means this algorithm needs 1000000 processors. Is it practical? From the "Parallel algorithm" of Wikipedia :

Parallel algorithms are valuable because it is faster to perform large computing tasks via a parallel algorithm than it is via a serial (non-parallel) algorithm, because of the way modern processors work. It is far more difficult to construct a computer with a single fast processor than one with many slow processors with the same throughput. There are also certain theoretical limits to the potential speed of serial processors. On the other hand, every parallel algorithm has a serial part and so parallel algorithms have a saturation point (see Amdahl's law). After that point adding more processors does not yield any more throughput but only increases the overhead and cost.

The term "Parallel computing" contains more aspects of parallel algorithms, including models, programming issues, architectures, etc.

According to "Topics in Computer Science::Parallel Algorithm Design", the design of a parallel algorithm can be viewed as consisting of four stages - Partitioning, Communication Analysis, Granularity Control and Mapping.

"A Library of Parallel Algorithms" :
"Topics in Computer Science::Parallel Algorithm Design"
the design of a parallel algorithm can be viewed as consisting of four stages - Partitioning, Communication Analysis, Granularity Control and Mapping.
Designing and Building Parallel Programs, by Ian Foster
Designing and Building Parallel Programs is a book for students and professionals who need to know how to write parallel programs. It is neither a programming language manual nor an academic treatise on algorithms. Instead, it approaches parallel programming as an engineering activity, in which programs are developed in a methodical fashion and both cost and performance are considered at each stage in a design. It is intended as both an introduction to parallel programming and a practitioner's guide for programmers, engineers, and scientists developing programs for parallel and distributed computer systems.
A presentation, Introduction to Parallel Computing by Blaise Barney. This doc. inclined to implementation of parallel computing not theory. It take parallel programming as a technique of performance improvement.
Parawiki : This is a wiki dedicated to provide a single online resource for all questions regarding parallel computing.
Guy E. Blelloch and Bruce M. Maggs, "Parallel algorithms", ACM Comput. Surv., volume 28, number 1, 1996, pages 51--54.
MIT OpenCourseWare, 18.337J Applied Parallel Computing, (SMA 5505), Spring 2003
Design and Analysis of Parallel Algorithms, Murray Cole. The course note is course overheads.

2006-10-20 Update

Today my labmate told me a paper proved that one can adjust the parallel algorithms to a pratical version, i.e., there exist a modified parallel algorithm with time-complexity O(M/p+T) and n processors, if there is an parallel algorithm able to solve a problem in time T and computations M.

Ref. : R.P. Brent, "The Parallel Evaluation of General Arithmetic Expressions," J. ACM 21 (1974), 201--206.

It is shown that arithmetic expressions with n ? 1 variables and constants; operations of addition, multiplication, and division; and any depth of parenthesis nesting can be evaluated in time 4 log2n + 10(n - 1)/p using p ? 1 processors which can independently perform arithmetic operations in unit time. This bound is within a constant factor of the best possible. A sharper result is given for expressions without the division operation, and the question of numerical stability is discussed.

Posted by AbnerCYH | Permanent Link | Categories: Notes

2006-04-02 04:12:01

Theory Lunch

I found a term "Theory Lunch" on webpages of many TCS groups. It's interesting but I didn't know its meaning. This little question suspended in my mind for a long time.

Now I find a explanation of it from Luca Trevisan's webpage.

Once a week, the Berkeley theory community gets together, socializes, has lunch, and then listens to an informal blackboard presentation on open questions, new results, old results worth not forgetting, or whatever is fit to entertain a crowd of theoreticians.

At the theory lunch, we do not believe in using slides, waiting until the end to ask questions, or stopping the speaker when he or she runs out of time.

Posted by AbnerCYH | Permanent Link

2006-04-02 03:20:32

Had dinner with a old friend last night

Last night, I had dinner with a old friend, Pin Sang Huang who is my classmate in my senior high school. He is master program student of Institute of Electro-Optical Science and Engineering, National Cheng Kung University.

In my memory, last time I met him is 3 or 4 years ago. We were really good friends at senior high school, so that we have a lot of things to talk. We had fun, but the fact that I am a theorist (more precisely, I want to be a theorist) surprised him.

We changed very much. He had told me that he wnat to be a doctor, but he became a student of EE, not bio-science related fields. I had loved physics, but I am a student of CS now.

Posted by AbnerCYH | Permanent Link

2006-04-01 16:48:17

Note on Lab Meeting 20060331

Actually, I forget the name of tonight's paper. I remember that it use SOM to solve HP-Model. I don't be interested in its details.

However, It's the first time that I heard the idea of self-organizing map (SOM).

The SOM algorithm is fed with feature vectors, which can be of any dimension. In most applications, however, the number of dimensions will be high. Output maps can also be made in different dimensions: 1-dimensional, 2-dimensional, etc., but most popular are 2D and 3D maps, for SOMs are mainly used for dimensionality reduction rather than expansion.

The interesting example of this seminar is solving the Traveling Salesperson Problem by SOM. I found a web page about it, "Traveling Salesman Problem by Neural Approaches". It briefly introduce the neural computation methods on TSP.

Like most artificial neural networks, the SOM has two modes of operation:

1. During the training process a map is built, the neural network organises itself, using a competitive process. The network must be given a large number of input vectors, as much as possible representing the kind of vectors that are expected during the second phase (if any). Otherwise, all input vectors must be administered several times...

2. During the mapping process a new input vector may quickly be given a location on the map, it is automatically classified or categorised. There will be one single winning neuron: the neuron whose weight vector lies closest to the input vector. (This can be simply determined by calculating the Euclidean distance between input vector and weight vector.)

Ref. Self-organizing map - Wikipedia, the free encyclopedia

The more details of SOM on TSP refer to "Self-organizing feature maps and the travelling salesman problem", Bernard Ang��niolCorresponding Author Contact Information, Ga��l de La Croix Vaubois and Jean-Yves Le Texier, Neural Networks, Volume 1, Issue 4, , 1988, Pages 289-293.

An approximate tour in our approach is given by a set of nodes joined together in a one dimensional ring, evolving in a continuous manner towards the ultimate solution path.

An intuitive view of the algorithm is as follows, An iteration step consists in the presentation of one city. The node closest to the city being presented moves towards it and induces its neighbors on the ring to do so as well, but with a decreasing intensity along the ring. This correlation between the motion of neighbor nodes intuitively leads to a minimization of the distance between two neighbors, hence giving a short tour. The nodes will progressively become independent from one another, and, eventually, each will attach to one city.

This approach seems intuitive, however it is more like concept of "force-balanced system". What's the core-idea of SOM?

Ref. :Neural Nets: Dr K Gurney, section 7 Competition and self-organisation: Kohonen nets, Kohonen's self-organising feature maps

The algorithm

At each vector presentation the following sequence of steps occurs

* Find the node k whose weight vector is closest to the current input vector.
* Train node k and all nodes in some neighbourhood of k.
* Decrease the learning rate slightly
* After every M cycles, decrease the size of the neighbourhood

Advanced Reading:(for myself)
Ref. : Gee, A.H.; Prager, R.W., "Limitations of neural networks for solving traveling salesman problems," Neural Networks, IEEE Transactions on , vol.6, no.1pp.280-282, Jan 1995

Ref. : Smith, K.; Palaniswami, M.; Krishnamoorthy, M., "Neural techniques for combinatorial optimization with applications," Neural Networks, IEEE Transactions on , vol.9, no.6pp.1301-1318, Nov 1998

Posted by AbnerCYH | Permanent Link | Categories: Notes, Seminar