August 2006 Archives

2006-08-31 03:46:56

Graphviz

4

I am preparing slides of my thesis proposal, so that I would like to draw a diagram to collect the graph classes such as Complexity Zoology. The first tool I thought is Dia, however, for such simple job, I don't think Dia is convenient enough, therefore I tried Graphviz.

However, I failed to install Grappa, ZGRViewer, KGraphViewer, and yapgvb in my Linux. Yes, I am a user of linux, but I am an end user. The Ajax/Graphviz works, but it seems not suitable for larger graph. (above graph is not accepted by it) DOTTY is another tool, that works on my PC, but it couldn't export as png, jpeg, gif, etc.
  • pydot, a Python interface to Graphviz's Dot language.
  • Ladot: LaTeX in your Graphviz dot files
Update:
I use Ladot to convert DOT language to ps file and then use GIMP convert it to PNG! It works! Thx for those hackers!

Posted by AbnerCYH | Permanent Link

2006-08-30 23:14:15

The Drawback of Lab

me

My Lab is like usual office. There is no persoanl office room, every member has a desk in an open space like above picture.

The main drawback for me is not exposing persoanl privacy. It is that I can't work alone. Sometime people want to be away from the social network.

Posted by AbnerCYH | Permanent Link

2006-08-30 03:01:06

Topological Methods

I used this name like probabilistic methods, I referred it to topological methods on combinatorics and computer science.

Recent years, the randomized algorithms and probabilistic methods got a lot of attentions of computer scientists, but the topological method seems not so popular to computer scientists. However its ideas and techniques have exactly been infiltrated into many different fields. The most familiar ``topology'' for CS people might be Network Topology, however most people use this term for presenting the graph-theoretic properties not for topological properties.

Lecture Notes, Survey, and Talks slides Personally I saw the topological terms on the field linear programming most frequently. I don't know the whether there exist other applications on computer science, but I really want to know.

References
  1. Lovasz, L. Topological and Algebraic Methods in Graph Theory, Bondy, J.A. & Murty, U.S.R. (ed.) Graph theory and related topics: Proceedings of the conference held in honour of Professor W.T. Tutte on the occasion of his sixtieth birthday, University of Waterloo, July 5-9, 1977 Academic Press, 1979
  2. Using the Borsuk-Ulam Theorem: Lectures on Topological Methods in Combinatorics and Geometry, by Jiri Matousek with the collaboration of Anders Bjorner and Guenter M. Ziegler, Springer (Heidelberg), April 2003
  3. Turaev, V. Lectures on Topology of Words, 2006.

    We discuss a topological approach to words introduced by the author. Words on an arbitrary alphabet are approximated by Gauss words and then studied up to natural modifications inspired by the Reidemeister moves on knot diagrams. This leads us to a notion of homotopy for words. We introduce several homotopy invariants of words and give a homotopy classification of words of length five.

  4. L. Lovasz, A homology theory for spanning trees of a graph, Acta Math. Acad. Sci.
I barrowed the first book from the library of NCU, but I can't understand any page. Lovasz proved some graph theory theorem by Bolzano-Weierstrass theorem. I even couldn't image how to use continuous mathematics to prove discrete mathematics. It is really hard for me. (Lovasz, L. A homology theory for spanning trees of a graph. Acta Math. Acad. Sci. Hungar. 30 (1977), no. 3-4, 241--251.) Maybe I should read his lecture note ``Discrete and Continuous: Two sides of the same?''

Update 2007-03-26:

Suresh wrote some posts on his blog about the discrete and the continuous, like Geometry and Combinatorics.

It is also a statement that encapsulates the primary source of tension and profundity in this field, namely the interplay between the discrete (algorithms) and the continuous (geometry).

There is a deeper level of interplay between the discrete and the continuous in geometric algorithms though. In a sense, some of the best theory in computational and combinatorial geometry has come from trying to abstract out a combinatorial structure that describes a geometric setting, and then using that combinatorial structure to prove a desired result.

And in his another post, Order polytopes, he talked more about it:

Combinatorics and topology have strong connections, and one doesn't have to go too far from home find instances of "lifting" a combinatorial problem to a topological domain to solve it.

He also recommanded the book, Matousek's Lectures on Discrete Geometry, as a wonderfully readable source of Discrete Geometry.

Posted by AbnerCYH | Permanent Link | Categories: Notes

2006-08-26 18:21:35

Gestatation of My Thesis Proposal

I am gestating a proposal for my thesis.

Yesterday, my fellow gave a talk for his topic of thesis to my advisr on the Lab meeting. He did a nice proposal. He would like to study that generate XXX graph at random uniformly, where XXX means a name of graph class, such as chordal, perfect, simple, etc.

I foucs on the algorithm design on chordal graph. Actually, I don't care the type of algorithms, i.e., approximation, randomized, etc. Those are not like parallel algorithms, distributed algorithms, that need totally different paradigms. Those sub-fields do not restricted in a general constraint. If I want to takle a problem, then I will try to design a polynomal time deterministic or randomized algorithm at first, and I will try to devise (randomized) approximation algorithms if the problem is in NP-C.

Posted by AbnerCYH | Permanent Link

2006-08-25 05:00:00

Graph Decomposition

Graph decomposition is an important techniques to cope the problem on graph theory, furthermore it become a important and vivid subfield.

I notice this topic because tree-decomposition, such as path-width, tree-width, however, there are many interesting decomposition techniques like modular decompositions, clique decompositions.

Posted by AbnerCYH | Permanent Link | Categories: Notes

2006-08-25 00:33:20

Algorithmic Graph Theory

The introductory page of Wikipedia is good enough, so that it will be meaningless to collect links again. I just summary something useful to myself.

Lecture Notes: Resources:
  • BHOSLIB: Benchmarks with Hidden Optimum Solutions for Graph Problems (Maximum Clique, Maximum Independent Set, Minimum Vertex Cover and Vertex Coloring)
  • PADS: A library of Python Algorithms and Data Structures implemented by David Eppstein of the University of California, Irvine. It contains some codes of graph algorithms, such as LexBFS, MST, etc.
  • Graph Classes in ISGCI
  • LEDA - A Library of Efficient Data Types and Algorithms
  • ``TreewidthLIB,'' A benchmark for algorithms for Treewidth and related graph problems.


Sub-topics: Graph Decomposition |

Posted by AbnerCYH | Permanent Link | Categories: Notes

2006-08-24 05:40:15

Recent days

In recent days, the most important news is the winners of Fields and Nevanlinna Prizes in this year. Terence Tao, one of winners of Fields medal, is an outstanding mathematician and he also did many good jobs on combinatorics; Jon Kleinberg received the Nevanlinna Prize because of contributions on network analysis and algorithms.

Personally, I tried to understand Uriel Feige and Joe Kilian's paper ``Zero Knowledge and the Chromatic Number.'' It is hard to me, however I have got some senses on it. I hope I could finish it before 9/8, because I need to report it on that day.

The difficulties of understanding U. Feige's paper to me is that the many backgrounds such as fractional graph theory and abstract of PCPs. The ignored parts of this paper make me confused, however, in the other hand, this paper gave me a road-map to catch up.

For example, I found Fractional Graph Theory is interesting. I would like to read some chapters of ``Fractional Graph Theory: A Rational Approach to the Theory of Graphs,'' by Edward R. Scheinerman, Daniel H. Ullman, after I finsih U. Feige's paper.

My friend will report Irit Dinur's ``The PCP Theorem by gap amplification.'' on LEAST seminar 8/25. I will spend hours to preview this.

Posted by AbnerCYH | Permanent Link

2006-08-21 16:31:42

Fixed Point Theorem

Note: Actually I focused on "Discrete" Fixed Point Theorem (, or Discrete Zero Point). Since game theory interests more and more computer scientists recent years, this topics seem to get more attentions.

Ref. : Fixed-point theorem - From Wikipedia, the free encyclopedia

The first time I learn the idea of fixed point theorem is in the course of advanced calculus (principle of mathematical analysis). After that, I have interest of it and feel it probably useful for algorithmics.

Besides, there is an interesting lecture note, CS294 Seminar on Algorithmic Aspects of Game Theory, by Christos H. Papadimitriou. It contains fixed point theorem. It is what I want to read!

I also knew some new researches on Fixed Point Theorem on COCOON 2006. I will survey it later.

2006-10-26 Update: Mark C. Chu-Carroll wrote an article about Fixed Point Theorem and Connected Topology.

Posted by AbnerCYH | Permanent Link | Categories: Notes

2006-08-19 09:17:34

COCOON'06

Ref. :The Twelfth Annual International Computing and Combinatorics Conference (COCOON'06), Taipei, Taiwan, August 15-18, 2006

In fact, I only attended the conference on August 16 and 18, although the conference held for four days. The first day, the invited talker is Franco P. Preparata. It is a familiar name to me, but I don't really know the old man. However a friend told me that prof. Preparata is the adviser of director of Institute of Information Science Academia Sinica, Der-Tsai Lee. And I found prof. Xiaotie Deng is prof. Christos H. Papadimitriou's student.

COCOON is a good and famous international conference in my field, however it surprised me that there are only about 70 peresons attending this conference.

IMG_0804_s

DSC_1472_s

Posted by AbnerCYH | Permanent Link

2006-08-11 15:43:58

Computational Biology Problems on Graph Representations

Since I was an member of NSC project on Graph Domination, I wanted to find some graph problems on computational biology so that I could combined personal interests and interests of Lab, e.g., ``Randomization and Feedback Properties of Directed Graphs Inspired by Gene Networks,'' M. Cosentino Lagomarsino, B. Bassetti, P. Jona. To appear in: Proceedings of the CMSB conference, Lecture Notes in Bioinformatics, Springer, 2006. However, the problem of this paper seems not so meaningful on computational biology, and its approach is complex science style.

Ref. : WF Lu and WL Hsu "A Clustering Algorithm for Interval Graph Test on Noisy Data," Experimental and Efficient Algorithms: Second International Workshop, WEA 2003, Lecture Notes in Computer Science, 2647, pages 195-208, January 2003.

Four typical models have been proposed for dealing with errors. The definitions are as follows: 1) interval graph completion problem: assume the input data only contain FNs and minimize the number of edges whose addition makes the graph an interval [KS1996, KST1999, NS1998]. 2) interval graph deletion problem: assume there are only FPs in the input data, and minimize the number of edges whose deletion makes the graph an interval graph [GGKS1995]. 3) interval sandwich problem: assume that some pairs of clones are definite overlaps, some are definite non-overlaps, and the rest are unknown, then construct an interval graph under these overlapping constraints [GKS1994], [KS1999]. 4) intervalizing k-color graph problem: assume that clones are created from k copies of DNA molecule, and some pairs of clones are definite overlaps. The objective is to generate a k-colorable interval graph with the overlapping conditions [GKS1994], [GGKS1995], [FHW1993], [BF1996].

Our philosophy is that, in order to determine whether certain overlapping information is valid or noisy, we check the neighborhood data to see if it conforms ``approximately'' to a particular local structure dictated by the problem.

However those models are in NP-H and the optimal solution might not make any biological sense. The authors gave the experimental results instead of theoretical analysis of time complexity, but the performance criteria of algorithm are number of jump intervals based on synthetic data not running time. This problem, how to chose the experimental benchmark of graph algorithm, interests me for a long time.

For viewpoint of algorithmics, Alon and Shapira's paper ``A characterization of the (natural) graph properties testable with one-sided error,''

More importantly, as a special case of our main result, we infer that some of the most well studied graph properties, both in graph theory and computer science, are testable with one-sided error. Some of these properties are the well known graph properties of being perfect, chordal, interval, comparability and more. None of these properties was previously known to be testable.

``A polynomial time recognition algorithm for probe interval graphs,'' Julie L. Johnson and Jeremy P. Spinrad, SODA '01.

Probe interval graphs were introduced to model a problem arising in a form of DNA sequencing. This paper presents an $O(n^2)$ algorithm for recognizing probe interval graphs. This is the first polynomial time recognition algorithm for this class.

Probe interval graphs is defined as

A graph is a probe interval graph if its vertex set can be partitioned into two sets, probes (P) and non-probes (N), such that N is independent and new edges can be added between non-probes in such a way that the resulting graph is a interval graph.

Another problem on interval graphs is testing for the consecutive ones property. Booth and Lueker have given a linear time algorithm by PQ-tree. (Ref. Ron Shamir's lecture note, theorem 10.9 and 10.10 in page 124, 125 respectively. And page 133~134) The consecutive ones test is useful for physical mapping and DNA sequence assembly, for example, in the STS content mapping of YAC library, and in the Bactig assembly based on STS as well as EST markers. Ref. ``A Test for the Consecutive Ones Property on Noisy Data Application to Physical Mapping and Sequence Assembly,'' Wei-Fu Lu, Wen-Lian Hsu Journal of Computational Biology 2003 10:5, 709-735

Molecular Networks:

Posted by AbnerCYH | Permanent Link | Categories: Papers, Notes

2006-08-10 11:10:16

A Brief Summary of Workshop on BioAlgorithmics

The activities we attended are Workshop on BioAlgorithmics (3 days) and RECOMB Satellite Workshop on Regulatory Genomics (2 days). I selected some interesting talks of them, but those selected talks are restricted to my interests and background.

  • Session 2, Invited Talk: The similarity metric and the distance metric. Kaizhong Zhang, University of Western Ontario

    Actually, many works in this workshop are just presenting an alternative metric for estimating similarity, i.e., Information Content, density, ..., and some other mathematical formula. This paper discuss the how to speed up the computations of distance metric. Those solutions could be divided into two class, Distance-based data structure (MVP-tree, M-tree, etc.) and Metric embedding. This paper focus on Metric embedding.

  • Session 2, Talk #1: Motif discovery from both ends: combinatorial and probabilistic, Mehmet M. Dalkilic, Indiana University, USA

    iGibbs: A Motif Discovery Framework for Gibbs Sampling Algorithm, to appear Proteins: Structure, Function, and Bioinformatics (2006)

    Recently I am specially interested in Gibbs Sampling. The Gibbs Motif Sampler Homepage, iGibbs.

    Siddharthan R, Siggia ED, van Nimwegen E (2005) PhyloGibbs: A Gibbs Sampling Motif Finder That Incorporates Phylogeny. PLoS Comput Biol 1(7): e67

    NEUWALD, A. F., LIU, J. S., LAWRENCE, C. E., Gibbs motif sampling: Detection of bacterial outer membrane protein repeats, Protein Sci 1995 4: 1618-1632

    Lawrence et al. Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. Science. 1993 Oct 8;262(5131):208-14.

    George Casella and Edward I. George. "Explaining the Gibbs sampler". The American Statistician, 46:167-174, 1992.

    A Brief Overview of Gibbs Sampling. Eric C. Rouchka. Washington University. Institute for Biomedical Computing. Statistics Study Group. May 20, 1997.

  • Homology search: The more recent DNA homology search software PatternHunter improves BLAST with the new `optimized spaced seed' idea. A spaced seed S is a binary string. Its length is |S|, while its weight is its number of ones. A 1 in the seed S corresponds to a position with a required match and a 0 means "don't care."

    PatternHunter used weight-11 spaced (multiple) seeds by default. For example, if the seed 111010010100110111 is used, then two length-18 DNA strings must match at the `1' positions in order to be regarded as a hit, and all the hits can be efficiently found by constructing an index table of the subject sequence.

    This is another topic interest me.

  • An Application of Integer Linear Programming in Network Analysis, William Y.C. Chen, Andreas W. M. Dress, Winking Q. Yu, Director of PICB China.

    Actually this paper is accepted by ICCSB 2006. This paper is discuss the idea ``community'' of network. ``Community'' is used here to refer to those subsets of the vertex set that have a higher density of edges within their interior than along their boundary. Observe that identifying a "community structure" in a network is nothing but adding and eliminating edges in a somehow "most parsimonious" way so that the network becomes a disjoint union of cliques.

The topics I summaried here are part of topics I reported on the meeting. There are two topics interest me after I attended the workshop, i.e., Homology search and Gibbs Sampling.

Posted by AbnerCYH | Permanent Link | Categories: Notes, Seminar

2006-08-10 09:06:17

Two Meetings Yesterday

Yesterday I attended the two meetings. The first one is travelogue about workshop on bioalgorithmics and RECOMB on Singapore. The visiting assistant professor Wing-Kai Hon attended the meeting, too. Prof. Hon is really an original researcher. He could give comments very quickly, and these comments are useful and meaningful.

The second one is Dr. Gung-Wei Chirn's talk with topic ``Gene set enrichment analysis (GSEA) on Schizophrenia brain tissue gene expression data.'' Dr. Chirn is a senior researcher of Novartis Institutes for Biomedical Research, USA. He is an expert on bioinformatics.

The Gene Set Enrichment Analysis (GSEA) is interesting.

Gene Set Enrichment Analysis (GSEA) is a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states (e.g. phenotypes).

I think that bioinformatics is really useful, but we might understand too less biology to help biologists.

Dr. Chirn said that for helping biologists he need to know the hypothesis, that biologists couldn't say but understand, therefore he will read many papers quickly. Since Dr. Chirn has a master degree of biology, and he emphasize on the word ``quickly,'' so that we all thought that it might cost several weeks. Nevertheless, he explain later the word ``quickly'' means it might cost couple of months!!! We are too naive on that!

Posted by AbnerCYH | Permanent Link

2006-08-09 23:24:39

COCOON06 Registration

The Twelfth Annual International Computing and Combinatorics Conference COCOON'06. Taipei, Taiwan, August 15-18, 2006.

I registered this conference as the local student. It costed $30. The followings which I interested in are;
  • August 16, 10:20 am - 12:00 noon: Session A, A Simplicial Approach for Discrete Fixed Point Theorems.
  • August 16, 3:30 - 4:45 pm: Session B Graph Theory, Creation and growth of components in a random hypergraph process
  • August 17, 9:00 - 10:15 am: Session A, Finding Patterns with Variable Length Gaps or Don't Cares.
  • August 18, 10:20 am - 12:00 noon: Session B, Approximating min-max (regret) versions of some polynomial problems
  • August 18, 10:20 am - 12:00 noon: Session B, MAX-SNP Hardness and Approximation of Selected-Internal Steiner Trees
  • August 18, 10:20 am - 12:00 noon: Session B, Minimum Clique Partition Problem with Constrained Weight for Interval Graphs
  • August 18, 1:30 - 2:45 pm: Session B, On the Combinatorial Representation of Information
  • August 18, 3:05 - 4:20 pm: Session A, Lower Bounds on the Approximation of the Exemplar Conserved Interval Distance Problem of Genomes
  • August 18, 3:05 - 4:20 pm: Session B, Effectiveness of the Linear Programming Relaxation of the 0-1 Multi-commodity Minimum Cost Network Flow Problem.
Besides the experience of attending workshop on bioalgorithmics, it is the first time I attend the international conference. I hope next time I will present on an international conference:)

Posted by AbnerCYH | Permanent Link

2006-08-03 04:40:00

Approximation Algorithms and Combinatorial Optimization

Courses

Related notes on my blog: [Note on Linear Programming], [Optimization Theory].

Posted by AbnerCYH | Permanent Link | Categories: Notes