September 2006 Archives
2006-09-30 09:06:06
Ten Simple Rules for Getting Published
-
Read many papers, and learn from both the good and the bad work of others.
The more objective you can be about your work, the better that work will ultimately become.
Good editors and reviewers will be objective about your work.
If you do not write well in the English language, take lessons early; it will be invaluable later.
Learn to live with rejection.
The ingredients of good science are obviousˇXnovelty of research topic, comprehensive coverage of the relevant literature, good data, good analysis including strong statistical support, and a thought-provoking discussion. The ingredients of good science reporting are obviousˇXgood organization, the appropriate use of tables and figures, the right length, writing to the intended audienceˇXdo not ignore the obvious.
Start writing the paper the day you have the idea of what questions to pursue.
Become a reviewer early in your career.
Decide early on where to try to publish your paper.
Quality is everything.
2006-09-30 08:51:14
Ten Simple Rules for Reviewers
This issue of PLoS Computational Biology has two interesting articles to me. One of them is ``Ten Simple Rules for Reviewers.''
-
Do Not Accept a Review Assignment unless You Can Accomplish the Task in the Requested TimeframeˇXLearn to Say No. However it seems not easy to say no for chinese professor, I think the chinese culture made people be used to say yes for others, especially the powerful people's requests
Avoid Conflict of Interest.
Write Reviews You Would Be Satisfied with as an Author
As a Reviewer You Are Part of the Authoring Process
Be Sure to Enjoy and to Learn from the Reviewing Process. It is best case, but if this request was from your adviser then you might not have such many choices. ^_^;;;
Develop a Method of Reviewing That Works for You. I am convieced it is helpful, however I don't know how to do.
Spend Your Precious Time on Papers Worthy of a Good Review
Maintain the Anonymity of the Review Process if the Journal Requires It
Write Clearly, Succinctly, and in a Neutral Tone, but Be Decisive.
Make Use of the ``Comments to Editors''
References
-
Bourne PE (2005) Ten simple rules for getting published. PLoS Comput Biol 1(5): e57. DOI: 10.1371/journal.pcbi.0010057.
Bourne PE, Chalupa LM (2006) Ten simple rules for getting grants. PLoS Comput Biol 2(2): e12. DOI: 10.1371/journal.pcbi.0020012.
2006-09-26 09:56:27
Communications of the ACM, Volume 49, Number 9, 2006
A Wiki for discussing and promoting best practices in research
The result of our work combined effort with many other ACM and IEEE leaders as a Wiki (wiki.acm.org/healthcc) for presenting and discussing ideas and responses to this growing challenge.
Wow! The professors seems interested in Web 2.0!The business of software: Software: hard data, Phillip G. Armour
Productivity is Getting Lower
One of the most interesting things QSM found was that productivity of projects, after rising steadily and quite predictably for the 15 years from 1982 through 1997, has declined markedly since then. The primary productivity factor they tracked (a factor called "Productivity Index" or PI) is now around the same level as it was back in the late 1980s. It might be enlightening to postulate what reasons there could be for this decline.
I would like to invite reader comments on and interpretations of these results and if they match other observations and measurements. It would interesting if we, as a profession, were to think how we could measure the reasons for these findings, and understand what these critical trends actually mean to our business.
:)
2006-09-25 20:27:20
LaTeX on Webpage
I also want to get the LaTeX support in my blog, however MediaWiki is to heavy for personal usage. MimeTeX, licensed under the GPL, could easily embed LaTeX math in html pages without installing LaTeX, i.e.,
Note: For my server, it need to change ``../cgi-bin/mimetex.cgi?x=\frac{-b\pm\sqrt{b^2-4ac}}{2a}'' to ``../../cgi-bin/mimetex.cgi?x=\frac{-b\pm\sqrt{b^2-4ac}}{2a}''
Note: Since the pages in Archives have different path, therefore the path ``../cgi-bin/mimetex.cgi?'' should be changed to ``http://algorithm.cs.nthu.edu.tw/cgi-bin/mimetex.cgi?''
LatexRender seems more powerful, but it need to install more softwares on server.
Ref.: An article on Hongbo's Blog.
2006-09-25 01:14:41
American Mathematical Society :: Feature Column
It is very interesting column archive of AMS, some articles interested me. I listed them as following.
-
Oriented Matroids: The Power of Unification
Matroids: The Value of Abstraction
Variations on Graph Minor
Trees: A Mathematical Tool for All Seasons
Simple Chaos - The Henon Map
Topology of Venn Diagrams
Bin Packing
Bin Packing and Machine Scheduling
Complex Networks
A Discrete Geometrical Gem
Nets: A Tool for Representing Polyhedra in Two Dimensions
Polyhedra
Turing Machines
2006-09-24 22:32:09
NCTU EE5065 Optimzation Theory Note 1
This course is engineering style. Prof. Lin started from the chpater six of Zak's book, and he didn't explain some mathematical bases of optimization theory.
Some References:
-
I had collected some links about Optimzation Theory.
Optimization by Vector Space Methods, by David G. Luenberger
A First Course in Optimization Theory by Rangarajan K. Sundaram. There are many palces written ``See Rudin page xxx''....:) However, this book is written for student of economics, therefore many exapmles are about economics. It is another advantage of it to me.
Numerical Optimization, by Jorge Nocedal , Stephen Wright
Numerical Methods for Unconstrained Optimization and Nonlinear Equations, by J. E., Jr. Dennis, Robert B. Schnabel
Optimization in Economic Theory, by Avinash K. Dixit
Concepts of Geometry and Analysis:
-
Hyperplanes: I have a little difficulty to image the n-D space for n>3. An hyperplane is
-
If f is differentiable at x, then all partials
Let
There is a little difference between Optimization Theory and Analysis of Algorithms, i.e, small o notation o(). In Optimization Theory,
e.g.,
In Analysis of Algorithms,
e.g.,
Ref. Paul Walton Purdom, Jr., Cynthia A. Brown's The analysis of algorithms. And Edwin K.P.Chong and Stanislaw H.Zak's ``An Introduction to Optimization.''
The Big-O notation is different, too. The idea is similar, but in Optimization Theory they care the behavior of function as x approximating 0. According to Wikipedia,
There are two formally close, but noticeably different usages of this notation: infinite asymptotics and infinitesimal asymptotics. This distinction is only in application and not in principle, howeverˇXthe formal definition for the "big O" is the same for both cases, only with different limits for the function argument.
The Infinite asymptotics is like the Big-O notation in Analysis of Algorithms, the Infinitesimal asymptotics is like the Big-O notation in Optimization Theory.There are a lot of Matrix Calculus in this fields, however, I don't understand gow to do this. e.g.,
I collected some links about it, but they are references and lack details.
-
Notes on Matrix Calculus, by Paul L. Fackler. It is good.
Matrix calculus - Wikipedia, the free encyclopedia
The Matrix Reference Manual: Matrix Calculus, by Mike Brookes, Imperial College, London, UK.
We say that an iterative algorithm is globally convergent if for any arbitrary starting point the algorithm is guaranteed to generate a sequence of points converging to a point that satisfies the FONC for a minimizer. When the algorithm is not globally convergent, it may still generate a sequence that converges to a point satisfying the FONC, provided the initial point is sufficiently close to the point. In this case, we say that the algorithm is locally convergent.
Methods of Unconstrainted Optimization:
-
1-D: Most of problems are not in this case, however we could reduce objective function to 1-D search sometimes.
-
Golden section: reduce the uncertainity interval with the Golden Section ratio 0.382. The idea is to save the calculation of section points, i.e., take one of k-th section points as the one of (k+1)-th section points.
Fibonacci search: reduce the uncertainity interval with different ratios. According the proof, the ratio will be a function of Fibonacci series. The idea is to accelerate the reduction of uncertainity interval.
Newton's method: Need the first and second derivatives. It use the minimum of Tylor approximation f(x)=f(x')+f'(x)(x-x')+f''(x)((x-x')^2)/2. It only needs one starting point.
Secant method: It needs two starting points. It used the positions and derivates of such two points to compute the slope of first-order derivate for approximating the second order derivative in Newton's method.
-
Steepest descent algorithm: Use the first and second derivatives to find the Steepest descent feasible direction d and maximum feasible step size a for f(x')=f(x-ad).
Fixed step size gradient algorithm: Like Steepest descent algorithm but adapt the fixed step size. It is much simpler than steepest descent algorithm in practical, however the algorithm will converge iff. the step size
Ref.: David G. Luenberger's ``Linear and Nonlinear Programming,'' pp. 218, 219 for an alternative approach to the convergence ratio analysis. Ref.: Gradient descent - Wikipedia, the free encyclopedia
Newton's method (sometimes called the Newton-Raphson method): Gradient method might not efficient enough for solving problems. An extension of Newton's method will provide faster convergence ratio if the starting point is close to the x^* enough. Thus it also need to some convergent analysis.
It construct a quadratic approximation to the objective function that matches the first and second derivative values at that point. We then minimize the approximate (quadratic) function instead of the original objective function.
The procedure is that firstly solve
Therefore, for the quadratic case, the order of convergence of Newton's algorithm is infinity for any initial point. However, generally
Ref. :Newton's method in optimization - Wikipedia, the free encyclopedia
Modified Newton's method: The formula to compute the next point of Newton's method is
Levenberg-Marquardt modification to Newton's algorithm: (LMA)
Above algorithm needs that Hessian martix of f is positive definite to guarantee the descent property. If the Hessian matrix is not positive definite, then the search direction may not point in a descent direction. A simple technique to ensure that the search direction is a descent direction is to introduce the so-called Levenberg-Marquardt modification to Newton's algorithm:
The idea is to make the 2ed derivative positive definite.
If
Ref. : Levenberg-Marquardt algorithm - Wikipedia, the free encyclopedia.
Ref.: Gauss-Newton algorithm: (GNA) - Wikipedia, the free encyclopedia.
Ref.: Jacobian Matrix- Wikipedia, the free encyclopedia
Conjugate gradient method From Wikipedia, the free encyclopedia.
Origianlly the conjugate gradient method is an algorithm for solving particular systems of linear equations, namely those whose matrix is symmetric and positive definite, i.e.,
The conjugate vectors are linear independent, and the size of S equals n or the number of eigenvalues of A. (It also guanratee to arrive the optima in n steps.) In conjugate gradient method, for quadratic function,
If we trnasformed the objective function as
A systematic procedure for finding Q-conjugate vectors can be devised using the idea underlying the Gram-Schmidt process of transforming a given basis of R^n into an orthonormal basis of R^n.
The conjugate gradient algorithm without prespecified conjugate directions: It computed the i-th Q-conjugate direction in each iteration. The precedure is like the original one, but it computed the (i+1)-th conjugate direction
(1)
However, it is for a positive definite quadratic function of n variables. For the quadratic function that Hessian is unknown, there are three well-known modifications to eliminate Q in computing
-
Hestenes-Stiefel formula: Subsitute the Q in
The algorithm can be extended to general nonlinear functions by interpreting f(x) as a second-order Taylor series approximation of the objective function.
For nonquadratic problems, the algorithm will not usually converge in n steps, and as the algorithm progresses, the "Q-conjugacy" of the direction vectors will tend to deteriorate. Thus, a common practice is to reinitialize the direction vector to the negative gradient after every few iterations (e.g., n or n + 1), and continue until the algorithm satisfies the stopping criterion.
Quasi-Newton Methods: Newton's Method has two common dracbacks, one is the convergence depending on the choosing of initial point, the other is that computing inverse matrix of Hessian costs too much computations.
The first Quasi-Newton Algorithm was proposed by W.C. Davidon, a physicist working at Argonne National Laboratory. Like the story of QuickSort, Davidon got the trouble of applying optimization algorithms on his research, thus he decided to improve such algorithm. Eventually he developed the first quasi-Newton algorithm in the mid 1950s. However, such seminal idea is not accpeted for publications, it remained as a technical report until 1991.
The most popular quasi-Newton algorithms are BFGS method and DFP algorithm. The rough decirption of quasi-Newton method as following,
Quasi-Newton methods are also conjugate direction methods when it applied on guadratic objective function with symmetric Hessian, hence a quasi-Newton algorithm solves a quadratic of n variables in at most n steps.
(Refer to Theorem 11.1 of Zak's textbook. Notice that textbook doesn't give the proof of case k=0. Accodring to lecture notes of Prof. Lin, his proof relied on the formula of \alpha_k of conjugate gradient algorithm. However that the Quasi-Newton method is equivalent to conjugate gradient algorithm is what we want to prove, I don't think that putting it into the proof is correct.)
The Newton' method use the inverse Hessian to compute the next point. Thus if we want to approaximate inverse Hessian, then we want it to behave more and more like inverse Hessian for each iteration, i.e, for Hessian Q and a given k, since
Because inverse Hessian also holds above equation, Therefore, if
Thus one should prove the
For the quadratic function with Hessian Q=Q^T, the followings are Quasi-Newton.
-
Rank one correction formula: start from symmetric positive definite matrix H_0. However, for nonquadratic objective function, H^i may not be positive definite and thus d^{i+1} may not be a descent direction.
DFP algorithm (variable metric algorithm): start from symmetric positive definite H_0. H_{k+1} inherits positive definiteness from H_k.
BFGS algorithm: The principal idea of the method is to construct an approximate Hessian matrix B_k not inverse Hessian H_k. In particular, the BFGS update for B_k corresponds to the DFP update for H_k. Formulas related in this way are said to be dual or complementary. the BFGS algorithm enjoys all the properties of quasi-Newton methods, including the conjugate directions property. Moreover, the BFGS algorithm also inherits the positive definiteness property of the DFP algorithm.
For nonquadratic problems, quasi-Newton algorithms will not usually converge in n steps. As in the case of the conjugate gradient methods, here too some modifications may be necessary to deal with nonquadratic problems. For example, we may reinitialize the direction vector to the negative gradient after every few iterations (e.g., n or n + 1), and continue until the algorithm satisfies the stopping criterion.
Constrained Optimization: See the other notes.
2006-09-23 20:06:25
NTHU EE6455 Advanced Computer Architecture
Actually it has nothing to do with my research, I take this course just because I had buy the textbook, and my background is EE.
Courses
-
MIT 6.823 Computer System Architecture - Fall 2006, Krste Asanovic [2005]
Paul Kelly: Advanced Computer Architecture
Berkeley CS252 Graduate Computer Architecture. Spring 2001. David Patterson
I am most interested in the Performance Evaluation. The authors of our textbook thought the best metric of performance is execution time, however there are various definitions for execution time, e.g., CPU time, response time, and many types of testing materials, i.e., real applications, modified application, kernels, toy benchmarks, synthetic benchmark. And there are different benchmarks for different needs, i.e., SPEC, TPC-H, and EEMBC for desktop, server, and embedded system respectively.
Actually, it is really difficult to analyse and compare the performance of different computer systems. There are too many chances to improve the performance by modifying the testing procedures, e.g., using the more powerful compiler. This condition also appeared in the experiments of algorithms.
I don't exactly know meaning of the ``summary performance.'' The arithmetic mean is intuitive, but harmonic means ,geometric means....
There are two approaches to summary the performance of individual testing program: average and normalization. The first one is intuitive for calculating the arithmetic mean of (weighted) execution times; the second one normalizes the execution times of machines to a reference, i.e., a specific machine, and calculate their means. The geometric mean might be more suitable for this case; it could eliminate the influence of reference choosing, since GM(A)/GM(B)=GM(A/B), nonetheless there are two main drawbacks, the first is that it can't predict the execution time, i.e., for some workloads, there are different execution times for the machines with same geometric mean measurement; the second one is that the same ratio improvement will get the same improvement of GM, therefore people will prefer to spend time on easier improved parts.
Hence the best solution still analyze the weights of individual testing programs of the real workload.
Lecture 2: Instruction Set Principles
Lecture 3: Pipelining
The instructor taught this, that is based on appendix A of textbook, because the follwoing chapters will heavily depend on pipelining.
Some mnemonics
-
``D'' is prefix or suffix for the extended 64-bits instructions of MIPS64, e.g., LD for load, SD for store.
``I'' is suffix for immediate version of instructions, e.g., ADDI for add.
``U'' is suffix for unsigned arithmetic instructions which do not generate overflow exceptions, e.g., DADDU for extended 64-bits add, however sometimes the notation will be modified slightly if cooperated with immediate version like DADDUI.
If clock cycle is as small as the sum of the clock skew and latch overhead, no further pipelining is useful, since there is no time left in the cycle for useful work.
(Pentium 4 vs. Pentium 3 met this problem.)The speedup equation of pipelining in this chapter is a little bit complicated. I prefer to undersatnd the meaning and deduce the needed equation immediately, but it might not good approach for exam.
In this book, the authors use ``exception'' for all other terms interrupt, fault, and exception.
Consider the branch and exceptions on pipelined machince. If exception occurs at the branch delay slot, it might got some problems such as "forget to do branch" after resuming, thus if exceptions occur, it might store the PCs of delay slots and the target PC of branch.
The ``precise exception'' is better and harder. However it is hard to do this in more complicated cases like float points for linger pipeline.
Exception order is not as the pipeline order
An instruction is called commited if it quaranteed to complete. In MIPS integer pipeline, all instructions are committed when they reach the end of the MEM stage or beginning of WB stage, and no instruction updates the state before that stage.
-
Structual Hazard: Since the difference of lengths of pipeline stages of instructions, it might occur that the memory access request at the same time.
WAW Hazard: The later instruction complete the WRITE before the former instruction completing the WRITE. (For the same target.)
WAR Hazard: Impossible since the register reads always in ID.
RAW Hazard: The later instruction complete the WRITE before the former instruction completing the READ.
For superpipelining, like MIPS R4000, designer divided the memory access into serval stages. Divide IF into two stages, and MEM into three stages (Combined with caches). In such case, the ``load'' and ``branch'' delay increased. By forwarding DS to EX, it still costed the 2 cycles for stalling, and `branch'' delay costed 3 cycles since the branch condition is determined at EX.
The MIPS architecture has a 1 cycle ``delay slot'', thus the MIPS R4000 uses a predicted-not-taken strategy for remaining 2 cycle of branch delay. Furthermore it needs more forwardings of R4000, i.e., EX/DF, DF/DS, DS/TC, and TC/WB.
MIPS design a new instruction ``Blez1'' (branch likely) that make delay slot be turned into no-op when branch is not taken. This is useful since compiler can't always find the instruction to insert into delay slot.
MIPS R4000 has 4 kinds of stall: Load stall, Branch stall, FP result stall (RAW for FP), FP structual stall. Thus the total pipeline CPI = pipeline CPI + Load stall + Branch stall + FP result stall + FP structual stall.
Lecture 5: Dynamic Scheduling -- Scoreboard: Scoreboarding is a technique for allowing instructions to execute out of order when there are sufficient resources and no data dependencies; it is named after the CDC 6600 scoreboard, which developed this capability. However, the scoreboard is limited in that it does not handle WAR and WAW hazards very well. (handle it by stalls)
-
Issue
Read operands
Execution
Write result
Lecture 6: Exploiting Instruction Level Parallelism: Instruction-level parallelism (ILP) is a measure of how many of the operations in a computer program can be performed simultaneously.
Dependency
-
True dependency: Data Hazard (RAW)
Name dependency: It is not real true data dependency. It could be solved by renaming.
-
Anti-dependency: WAR
Output dependency: WAW
Tomasulo Approach is another scheme to allow execution to proceed in the presence of hazards developed by the IBM 360/91 floating-point unit. This scheme combines key elements of the scoreboarding scheme with the introduction of register renaming.
-
Issue
Execution
Write result
The algorithm allows data forwarding wherever possible. (refer to The Scheduling Algorithm, however it seems not be mentioned by textbook.)
Two main adavantages of Tomasulo Approach:
-
register renaming
distributed control (by reservation stations)
Lecture 7: Dynamically Exploiting ILP (I)
Lecture 8: Dynamically Exploiting ILP (II)
2006-09-23 19:32:33
The Art of Uniformed Decisions
Like the online algorithms, ``property testing'' has its own framework of analysis, i.e., $\epsilon$-close or $\epsilon$-far. I don't think what the $\epsilon$ means is very important, because the metric function is quite different from various cases. Terms like testable are really matters.
Section 5 show that why we need to study Szemer\'edi's Regularity Lemma. However in the references, there are too much mathematical stuff. The first step might be [AFKS 2000] that the author said that it prove a variant of Regularity Lemma.
Section 8 discuss the non-testable property, it is interesting that they use the Yao's method to prove non-testability, and connect it to the section 5, i.e., `$\forall\exists$' type first order graph property.
2006-09-23 05:44:20
LEAST Seminar 20060922
Title: Faster algorithms for computing longest common increasing subsequences
Authors: Gerth Stolting Brodal, Kanela Kaligosi, Irit Katriel, and Martin Kutz
Source: 17th Annual Symposium on Combinatorial Pattern Matching (CPM06)
Speaker: Chieh Chin Chiu
Time: 9.22 (Friday) 19:00-21:00
Place: Room 550, Department of Computer Science, NTHU
This paper discuss the problem LCIS is to find the longest common increasing subsequences of two sequences A and B with length m, n respectively. The first algorithm is $O(mn)$ based on dynamic programming, and its space complexity could be reduced to $O(m)$ by divide-and-conquer approach, where $m>n$.
This paper also adopt the dynamic programming approach, but they use a special data structure, Van Emde Boas tree, to reduce the time complexity.
The authors defined a new data structure Bounded Heap with three operations, it could base on McCreight's Priority Search Tree, which costs $O(log n)$ each operation, however they chose the Van Emde Boas tree, because it supports fewer operations but costs only $O(log log n)$ each operation. And its operations are enough. Besides, it surprised us that this is invented in 1977. Prof. R. C. T. Lee commented on it: It is early age of CS. R. E. Tarjan's minimum spanning trees algorithm is in 1976....:)
The part I am most interesting in is the technique reducing the time complexity of dynamic programming. The algorithm has $n$-iterations, they compute length $i$ CIS in $i$-iteration by results of $i-1$-iteration.
According to the page of wikipedia or Erik Demaine's lecture note, the Van Emde Boas tree included the delete operation. Authors seem not exactly correct, or I misunderstand their expressions.
Van Emde Boas tree
The original paper is unavaliable. For applying this data structure to other problem, we should know the abstract problem -- fixed-universe successor problem, i.e., Maintain a dynamic subset S of size n from the universe
The idea of Van Emde Boas tree is to combine bit vector and binary search tree. It is defined recursively. Prof. Erik Demaine's lecture notes of Advanced Data Structures is very good reference, but I would like express it in the global view.
Actually it is heap, i.e., complete binary tree. The leaves is the bits to present the appearence of element in S, hence there are u leaves. Thus there are (log u) levels if we adopted the binary representation. Each internal node is 1 if one of its children is 1 or both are 1.
If we use the first half bits,
The details of operation definitions refer to Prof.Demaine's lecture notes. Furthermore it could reduce space complexity by using hash fuction.
Priority Search Tree
McCreight's Priority search tree is combination of priority queue and binary search tree.
Ref. :CSIE Geometric Computing and Visualization, Instructor : D. T. Lee
However, I can't see the difference between Priority search tree and binary search tree with geometic meaning in D. T. Lee's slides.
2006-09-22 03:46:53
Two papers captured my eyes
Ref. : Oleg Pikhurko, Helmut Veith and Oleg Verbitsky, The first order definability of graphs: Upper bounds for quantifier depth, Discrete Applied Mathematics, Volume 154, Issue 17, , 15 November 2006, Pages 2511-2529.
Another paper is Uriel Feige, Eran Ofek's ``Random 3CNF formulas elude the Lovasz theta function.''
I don't know when I first time hear Lovasz theta function, but I like it. It is about the Shannon capacity of the graph G.
Ref.: Lovasz, L., "On the Shannon capacity of a graph," Information Theory, IEEE Transactions on , vol.25, no.1pp. 1- 7, Jan 1979
2006-09-22 02:25:28
Parallel Architecture
Prof. Biing-Feng Wang teach ``Parallel Algorithm Design'' this semester.
PART I
1. Introduction: Classification of parallel computers .Performance of parallel algorithms
2. Shared-memory computers, basic techniques, and Brent Theorem
PART II
3. Tree Machine 4. Linear-processor array
5. Mesh-connected computers 6. Hypercubes
7. Perfect shuffles 8. Mesh-connected computers with multiple broadcasting
9. Processor arrays with reconfigurable bus systems
PART III
10. Systolic architectures 11. Wavefront array processors
12. Distributed algorithms
13. Fault-tolerance, randomized algorithms, P-completeness
References:
[1] J. JaJa, An introduction to parallel algorithms, Addison Wesley, 1992.
[2] S. G. Akl, The design and analysis of parallel algorithms, Prentice-Hall, 1989.
[3] Selected papers from journals and proceedings.
Some interconnection networks, e.g., arrangement graphs and hypercube-like graphs, interest me very much, they have pretty interesting properties.
Courses
-
MIT OpenCourseWare, EECS 6.895, Theory of Parallel Systems (SMA 5509), Fall 2003
MIT OpenCourseWare, EECS 6.896 Theory of Parallel Hardware (SMA 5511), Spring 2004 (Another Website)
Khaled Day and Anand Tripathi, Arrangement graphs: a class of generalized star graphs, Information Processing Letters, Volume 42, Issue 5, , 3 July 1992, Pages 235-241.
the arrangement graph, as a generalization of the star graph topology and prove many of its properties such as: hierarchical structure, vertex and edge symmetry, simple and optimal routing, and many fault tolerance properties. The arrangement graph presents more flexibility than the star graph in terms of choosing the major design parameters: degree, diameter, and number of nodes.
2006-09-18 15:25:09
Nonlinear Time Series Analysis
However, I found that there are less Internet resources explaining the idea of Nonlinear Time Series Analysis; there are many website talk about Time Series Analysis, but less about Nonlinear parts.
The Chaos Research Group
The Chaos Research Group studies deterministic chaos and nonlinear dynamics in engineering systems and is located within the College of Engineering at the University of Tennessee. This group closely collaborates with the researchers at the Oak Ridge National Laboratory.
They study some physical and mechanical phenomena. I can't understand most of them, but there are three topics specially interesting me, i.e. Internal combustion engines, Chaos control algorithms, Time-series analysis. Here I focus on Nonlinear time-series analysisThe process of analyzing time series using mathematical and numerical data transformations or even appropriate graphical displays constitutes a field of science known as time-series analysis. Conventional signal-processing techniques include Fourier transforms, autocorrelation functions and autoregressive data modeling, but these methods generally rely on linear relations and often have been found insensitive in describing the nonlinear structure in chaotic time series.
Chaotic time-series analysis (CTSA), or nonlinear time-series analysis (NTSA), refers to a class of data-analysis techniques employed to provide a richer description of chaotic time series. The CRG has been involved continuously in developing and applying new CTSA algorithms to time series, and current research topics include stationarity tests, dynamical state classification and periodic orbit detection.
Ben Goertzel's paper, Delay Vectors as Perceptual Chunks: Understanding Nonlinear Time Series Analysis as Self-Organizing Cognition, explain this by another viewpoint, but for me it is harder to understand.
I would like to ask: how do we decide to adopt Time Series Analysis or Nonlinear Time Series Analysis? Although most phenomena of real world are nonlinear, actually we could use many techniques to approximate them. Give some assumptions, ignore some variables, we could get pretty good results sometimes. What is the criteria of evaluating approaches?
Other Sites
-
Slides of lecture, Nonlinear Dynamics and Chaos, by Patrick McSharry
Modelling of Dynamical Systems & Time Series Analysis
Nonlinear Dynamics and Topological Time Series Analysis Archive. Nicholas B. Tufillaro collected some links and resources.
crg@utk: Links to related sites
Sci.nonlinear FAQ: [5.2] Where can I find specialized programs for nonlinear science?
2006-09-18 07:46:09
Data Analysis
However, I have some interests to learn it after watching Numb3rs....:)
My first question is the difference between Data Analysis and Data mining. According to Data analysis - Wikipedia, the free encyclopedia
Respect to Data mining, data analysis is usually more narrowly intended as not aiming to the discovery of unforeseen patterns hidden in the data, but to the verification or disproval of an existing model, or to the extraction of parameters necessary to adapt a theoretical model to (experimental) reality.
After taking a look into Wikipedia, I was surprised that so much sub-fields of statistics. However, I thought I should start from the terms taht are familiar to computer scientist.Data mining - Wikipedia, the free encyclopedia
Data mining (DM), also called Knowledge-Discovery in Databases (KDD) or Knowledge-Discovery and Data Mining, is the process of automatically searching large volumes of data for patterns such as association rules. It is a fairly recent topic in computer science but applies many older computational techniques from statistics, information retrieval, machine learning and pattern recognition.
Machine learning - Wikipedia, the free encyclopediaAt a general level, there are two types of learning: inductive, and deductive. Inductive machine learning methods create computer programs by extracting rules and patterns out of massive data sets. It should be noted that although pattern identification is important to Machine Learning, without rule extraction a process falls more accurately in the field of data mining.
Other fields about Data Analysis
-
Regression analysis
Analysis of variance
analysis of covariance:
an old-fashioned name for a linear regression model with one continuous explanatory variable and one or more factors. The name exists for historical reasons, but there is no particular reason to distinguish the method from the general purpose linear model.
Time series analysis Principal components analysisIn statistics, principal components analysis (PCA) is a technique for simplifying a dataset, by reducing multidimensional datasets to lower dimensions for analysis.
Technically speaking, PCA is a linear transformation that transforms the data to a new coordinate system such that the greatest variance by any projection of the data comes to lie on the first coordinate (called the first principal component), the second greatest variance on the second coordinate, and so on.
Other Resources
-
Interactive Statistical Calculation Pages listed here comprise a powerful, conveniently-accessible, multi-platform statistical software package.
StatLib is a system for distributing statistical software, datasets, and information by electronic mail, FTP and WWW. Therefore it might be a good start point for learning data analysis.
2006-09-18 05:41:30
Neuroeconomics
Ref.: Neuroeconomics - Wikipedia, the free encyclopedia
Ref.: Neuroeconomics Blog - Center for the Study of Neuroeconomics at George Mason University
The Neuroeconomics Blog try to give it a definition:
Neuroeconomics is an interdisciplinary research program with the goal of building a biological model of decision making in economic environments. Neuroeconomists ask, how does the embodied brain enable the mind (or groups of minds) to make economic decisions? By combining techniques from cognitive neuroscience and experimental economics we can now watch neural activity in real time, observe how this activity depends on the economic environment, and test hypotheses about how the emergent mind makes economic decisions. Neuroeconomics allows us to better understand both the wide range of heterogeneity in human behavior, and the role of institutions as ordered extensions of our minds.
There is a note that seems talking about the same thing. Cosma's Notebook: Social NeuroscienceRef.: Neuroeconomics at Caltech - The Camerer Lab
2006-09-18 03:36:26
Primal-Dual Scheme
Originally Primal and Dual method is another method solving linear program. In 1955, Harold Kuhn invent the first version primal-dual Hungarian method for solving assignment problems. This was revised by James Munkres in 1957, and has been known since as the Hungarian algorithm, the Munkres assignment algorithm, or the Kuhn-Munkres algorithm.After that, Primal and Dual method inspired many researchers in combinatorial optimization.
The main feature of the primal-dual method is that it allows a weighted optimization problem to be reduce to a purely combinatorial, unweighted problem. Most of the fundamental algorithms in combinatorial optimization either use the method or can be understand in terms of it, including the Dijkstra's shortest path algorithm [Dij59], ford and Fulkerson's network flow alforithm[FF56], Edmond's non-bipartite matching algorithm[Edm65], and of course Kuhn's assignment algorithm.
Actually we could take the dual program of assignment problem as min. weight perfect matching problem on bipartite graph.The idea of LP-Dual $\mathit{D}$ is that we could estimate the solution of LP-Primal $\mathit{P}$, i.e., one could try to match the linear combination of constraints to the objective function, therefore we would get the lower bound $\sum^m_j y_jb_j$ by the linear combination of constraints. Furthermore we could extend this idea.
We have the following theorem \THM{{\bf Weak Duality} For any primal-dual solution pair $(x,y)$, $\VEC{c}^T\VEC{x}\geq\VEC{y}^T\VEC{b}$.}
It is obviously that solution of $\mathit{D}$ is increasing and solution of $\mathit{P}$ is decreasing; when they met, we have \THM{{\bf Strong Duality} The optimal primal solution will equal to dual solution, i.e., $\VEC{c}^T\VEC{x}=\VEC{y}^T\VEC{b}$. }
\DEF{We define $\VEC{c}^T\VEC{x}-\VEC{y}^T\VEC{b}$ the {\bf duality gap}.}
\DEF{We define $\sup_I\frac{OPT_{IP}(I)}{OPT_{LP}(I)}$ the {\bf integral gap}.}
Primal-Dual Schema om Approximation Algorithms
-
Formulate Dual-LP,
Construct the feasible solution of Dual-LP,
Make the feasible solution tight combinatorially,
Keep the corresponding solution of Primal-LP feasible.\newline
References
-
The Primal-Dual Schema for Approximation Algorithms: Where Does It Stand, and Where Can It Go?, Vijay Vazirani, Georgia Institute of Technology (USA), Algorithms Seminar, December 11, 2000.
Introduction to Linear Optimization, by Dimitris Bertsimas and John N. Tsitsiklis.
Chapter 7: Provides a comprehensive review of the principal results and methods for the different variants of the network flow problem. It contains representatives from all major types of algorithms: primal descent (the simplex method), dual ascent (the primal-dual method), and approximate dual ascent (the auction algorithm). The focus is on the major algorithmic ideas, rather than on the refinements that can lead to better complexity estimates.
Vazirani, V.V., ``Primal-dual schema based approximation algorithms,'' Theoretical aspects of computer science: advanced lectures, Springer-Verlag New York, Inc., 2002, pp. 198-207. Williamson, D.P., The primal-dual method for approximation algorithms, Mathematical Programming, 2002, 91, 447-478.2006-09-18 03:03:18
Worse or Better
I got the TA position of ``Computer Systems & Applications,'' but I didn't pass the entrance exam of ``academic writing.'' However I need to report the Lovasz Local Lemma on``Special Topics in Algorithms.''
I need to read a survey about property testing and get an interview with my adviser, and build a server for ``Graph algorithms meeting.''
Nevertheless the worst thing might be the project that raising the standard of graduation. Master program student might publish a conference paper for graduating.
And there are something still interesting me, i.e., fractional graph theory (fractional dominating), topological method, LP primal and dual method.
2006-09-18 00:24:23
Fate and Plan
I remember that I would like to be a business man at junior high school, physicist at senior high school, EE engineer at college, and I changed my mind to apply master program of computer science.
At that year I prepared the entrance exam of graduate school, I found that algorithmics and combinatorics interested me more than operating systems and computer architecture on which I originally wanted to research.
However, I decided to join the Algorithm and Biocomputing Laboratory. The first reason was that Prof. Tang's professional field is algorithmics, and I thought that the algorithmics is the most important training which I need. One year passed, I think I have a little bit precipitation.
I always want to learn many things but I don't really spend enough time on those.
Now I am considering to plan my life. At least I am trying to decide to pursue a Ph.D. or not, but I also know that it is not that useful.
2006-09-17 17:47:49
Numb3rs
I currently enjoyed in those American television show about FBI, such as CSI, Criminal Mind. And I like NUMB3RS mostly.
The ep. 2 of season 1 introduced the NP vs. P problem....:)
The mathematic an and physicist are cool.
I had dreamed to be a scientist solving people's problems by math and scientific analysis just like complex scientist....
Ref.: NUMB3RS From Wikipedia, the free encyclopedia
2006-09-16 18:17:56
Zero Knowledge and the Chromatic Number
We present a new technique, inspired by zero-knowledge proof systems, for proving lower bounds on approximating the chromatic number of a graph. To illustrate this technique we present simple reductions from max-3-coloring and max-3-sat, showing that it is hard to approximate the chromatic number within[Omega](N[delta]) for some[delta]>0. We then apply our technique in conjunction with the probabilistically checkable proofs of Hastad and show that it is hard to approximate the chromatic number to within[Omega](N1-[var epsilon]) for any[var epsilon]>0, assuming NP[nsube]ZPP. Here, ZPP denotes the class of languages decidable by a random expected polynomial-time algorithm that makes no errors. Our result matches (up to low order terms) the known gap for approximating the size of the largest independent set. Previous O(N[delta]) gaps for approximating the chromatic number (such as those by Lund and Yannakakis, and by Furer) did not match the gap for independent set nor extend beyond[Omega](N1/2-[var epsilon]).
I am interested in this paper because I want to know how to connect the zero-knowledge proof systems to chromatic number problem. It seems without any relation between such two topics intuitively.Actually this paper is not the newest result on this problem, to best of my knowledge, the paper ``Linear degree extractors and the inapproximability of max clique and chromatic number'' by David Zuckerman on STOC2006 should be the newest result. But this paper appeal my attention first by his interesting title.
Free Bits
In the section 2, the authors summarized the previous works on this problem. They mentioned the idea ``free bit'' that are proposed by M. Bellare and M. Sudan's paper ``Improved Non-Approximability Results.''
It might useful to refer prof. Sudan's lecture note, but it is still hard to understand such concept. Luca Trevisan's paper ``Inapproximability of Combinatorial Optimization Problems'' explained this concept better, i.e.,
We say that a (r(n), q(n))-restricted verifier V for a language L has ``free bit complexity'' at most f(n) if, for every input x and every choice of the random input, there are at most $2^{f(n)}$ possible answers to the q(n) queries made by the verifier that would make it accept.
This implies that the important parameter to optimize is not the number of queries but rather the number of accepting computations.According to M. Bellare, O. Goldreich, and M. Sudan. ``Free bits, PCP's and nonapproximability - towards tight results.'' SIAM Journal on Computing, 27(3):804--915, 1998.
The free-bit complexity, roughly speaking, is the logarithm of number of possible accepting configurations of V on coins R and input x. (For example, a verifier which makes three queries and accepts, iff parity of the answers is odd, has four accepting configurations and thus free-bit complexity 2.)
I.e., {001,100,010,111} four accepting configurations. However the amortized free-bit complexity is still not easy to understand.the amortized free-bit complexity is the free-bit complexity (of a proof system with perfect completeness) divided by the logarithm of the gap (i.e., the number of free bits needed per factor of 2 increases in the gap).
where the gap g is defined as $g={completeness probability}/{soundness probability}$.The ``amortized query complexity'' is also an important parameter of the PCP verifier, although it is not mentioned here. There is a paragraph about it in Alex Samorodnitsky and Luca Trevisan's paper ``A PCP characterization of NP with optimal amortized query complexity'' ``Furthermore, one can show that, in a PCP characterization of NP, a verifier having error s must have query complexity at least log(l/s) (unless P.=NP) [5; 20]. So the best we can hope for (in terms of trade-off between error probability and number of queries) is to construct verifiers having error s and query complexity $\overline{q} \log(1/s)$, where $q>1$ is some constant that we would like to be as small as possible. The parameter $\overline{q}$ (the ratio between number of queries and logarithm of inverse error probability) is called the amortized query complexity of the verifier.''
Back to this paper, although it did not explain amortized free-bit complexity clearly, the author told us that amortized free-bit $f$ is the most relevant parameter for proving hardness of approximating $\alpha(G)$. I had confused the studies of PCPs. Since one had proved the PCP theorem, why do researchers still involve in the studies of some parameters such as query complexity? What the benefit is if we could decrease it significantly.
However, in the other hand, for improving the results of non-approximability, the different parameters might mean the explaining abilities, i.e., the study of different parameters of PCPs might be a kind of characterization of different intractable problems.
\redW{FGLSS Reduction}, proposed by Uriel Feige and Shafi Goldwasser and Laszlo Lovasz and Shmuel Safra and Mario Szegedy, ``Interactive proofs and the hardness of approximating cliques.''
Fractional Chromatic Number \redW{Fractional Chromatic number $X_f(G)$}, proposed by Lovasz ``On the ratio of optimal integral and fractional covers.''. \redW{k-tuple Chromatic Number}: Original definition from L. Lovasz, ``On the ratio of optimal integral and fractional covers''
\DEF{The {\bf k-tuple chromatic number} $X_k$ of $G$ is the least number $l$ such that it is possible to associate a $k$-subset $C(x)$ of $\{1,\ldots,l\}$ with every point $x$ of $G$ in such a way that $C(x)\cap C(y)=\varnothing$ for $(x,y)\in E(G)$. $X(G)=X_1(G)$ is the ordinary chromatic number.}
$X_k(G)$ could also be interpreted as the minimum number of independent sets which cover every point at least $k$ times.
The other definition proposed by the authors are in the inverse perspective. (However, it is not so lucid as the original one.)
\DEF{We define {\bf fractional chromatic number} $X_f(G)$ as the smallest real $k$ such that for some distribution $f$ on $G$'s independent sets, choosing $I\leftarrow f$ covers any $v\in V(G)$ with probability at least $1/k$.}
Ref.: Wikipedia: fractional chromatic number
Randomized Graph Products
The authors mentioned that there are two techniques. The first one is proposed in P. Berman and G. Schnitger, ``On the complexity of approximating the independent set problem'' and A. Blum, ``Algorithms for Approximate Graph Coloring.'' Another one is proposed in U. Feige, Randomized graph products, chromatic numbers, and the Lovasz -function.''
Randomized PCPs
It is convenient to view an RPCP as follows: A prover has an initial witness $W$ to the NP-statement and a source of randomness $R_P$. Based on $W$ and $R_P$ the prover generates a witness $w$ for the PCP verifier to check. If $x\in L$ then $w$ will have perfect completeness regardless of $R_P$. If $x\not\in L$, there is no legal witness $w$, and the verifier accepts with probability at most $e$.
The new parameter $\rho$ confused me again, however I think we could take the RPCP as a PCP system and Prover of it has the random bits, i.e., randomized prover.
Zero-Knowledge Proofs
I am studying Zero-Knowledge Proofs. Salil Vadhan wrote a good slide\footnote{http://www.cpi.caltech.edu/quantum-security/slides/vadhan.ppt} for this topic. The idea of zero knowledge is not hard to understand, but its formal definition isn't.
\DEF{$(P,V)$ is {\bf zero knowledge} if $\forall$ poly-time $V^{*}\exists$ poly-time $S$ s.t. when $x\in L, S(x)$ is computationally indistinguishable from $(P,V)(x)$.}
According to Oded Goldreich's tutorial, it means that there is a simulator $S$, that is a polynomial-time algorithm that generates false transcripts (without the prover) which are identical to the genuine, and an interactive proof has the zero knowledge property if a simulator exists for the proof. It means that \redW{The Verifier learns nothing in the interaction.}
I totally can't understand the statement. Why do they use the idea "simulator" to describe the idea zero knowledge? Usually, people use the term "simulator" to present a procedure that could work as simulated things, however the simulator must know something which Verifier don't know to simulate the whole work, since the Verifier doesn't interact with Prover.
I must get something wrong! but I don't know where it is! Maybe I need to read the original paper, Shafi Goldwasser, Silvio Micali, and Charles Rackoff. The knowledge complexity of interactive proof-systems. Proceedings of 17th Symposium on the Theory of Computation, Providence, Rhode Island. 1985. [Draft]
Zero-Knowledge Property
I had said that I am interested in the various complexities, so that I really want to know what it means although it might not useful to my research.
The original idea came from the paper, Shafi Goldwasser, Silvio Micali, and Charles Rackoff. ``The knowledge complexity of interactive proof-systems.'' Proceedings of 17th Symposium on the Theory of Computation, Providence, Rhode Island. 1985. In this paper the authors defined a measurement, knowledge complexity, of the amount of knowledge about the proof transferred from the prover to the verifier. and convinced it is an important parameter of proof systems.
It is great! It is not easy to formalize an informal and fuzzy idea, but they did it!
A ZK proof system is interactive proof system $+$ zero-knowledge property. The later one is the hardest part. A slides\footnote{http://www.cs.biu.ac.il/~porately/2001/algorithm/zk1.ppt} gave proofs for G3C, GI, respectively. Actually, Boaz Barak and Silvio Micali gave their lecture note and another on this topic respectively.
I thought Graph-3-Coloring is a earier one to understand, but it seems not so easy to me.
\DEF{$(P,V)$ is {\bf zero knowledge} if $\forall$ poly-time $V^{*}\exists$ poly-time $S$ s.t. when $x\in L, S(x)$ is computationally indistinguishable from $(P,V)(x)$.}
I had said that I totally can't understand the zero knowledge property, however if change the aspect of view, the probabilistic polynomial time simulator means simulate the ``interactive procedure'', i.e., That the interactive procedure could be simulated in probabilistic polynomial time means there didn't exist something intractable in the interactive part of original Interactive Proof System.
In the example of Graph-3-Coloring, the simulator will repeat forever until the verifier choose the pre-determined edge, but there exist only $m$ edges, i.e., the probability of the pre-determined edge doesn't be choosing for $m$ time run is $\frac{1}{m}^m$. Although in this example the verifier seems a fool who can't awake to the multiple time repeat.
The second question is that if it is computationally indistinguishable?
Piotr Berman and Georg Schnitger, ``On the complexity of approximating the independent set problem'', Information and Computation, Volume 96, Issue 1, , January 1992, Pages 77-94.
Update 2006-10-05
Prof. Madhu Sudan gave a course ``Approximability of Optimization Problems'' at MIT. The Lecture 23 at 1999-12-1 was ``Tailormade PCPs: Zero-knowledge and the chromatic number,'' that discussed the custom-made PCP and use this paper as example. He explained the ZK in here.
Of course, here we don't care whether the verifier learns anything or not, all we want is for the FGLSS graph G to have chromatic number X(G) equal to the number of valid probabilitistically checkable proofs of a given theorwm; and if the theorem does not have a valid proof, then the chromatic number of the FGLSS graph cannot be small.
This paper is really hard to understand. I try to understand it by converting it back to normal PCP (opposite to Randomized PCP), but prof. Sudan followed the authors' direction.2006-09-08 16:00:36
Manuel Blum and His Students
But I don't know Manuel Blum is also an outstanding teacher!
(Click the picture for larger size.)
OH! I think Manuel Blum is a Mr. CS indeed.
P.S. I drawn this graph by Ajax/Graphviz, and the data are from The Mathematics Genealogy Project.
2006-09-07 23:23:54
The New Semester
I am considering several things. My senior, a Ph.D. student, will teach the course ``Computer Systems & Applications,'' and I would like to apply the position of TA, however, I project to attend four courses, computer architecture, Optimization Theory, theory of computation, and academic writing. Besides I need to write my thesis and to attend ``Special Topics in Algorithms'' and ``Graph Algorithms Meeting.'' It is really heavy!
The worse thing is that my adviser seems interested in property testing recently, therefore it would become my topic of thesis, but I know less about it and I don't think that I have time to learn it....:(
2006-09-07 15:01:16
Theory of Computation
Computational Complexity:
-
CalTech CS 151: Complexity Theory (Spring 2005), Instructor: Chris Umans
Harvard Interactive Proofs & Zero-Knowledge Proofs, Salil P. Vadhan
CMSC 652 --- Complexity Theory, Fall 2005, Jonathan Katz
Brown U., CS159: Introduction to Computational Complexity, Prof. John E. Savage.
UIUC CS497 Concrete Models of Computation Spring 2003, Jeff Erickson
Princeton University, CS522 Computational Complexity, Boaz Barak
Oded Goldreich, Complexity Theory - Material
Drafts of books
-
Computational Complexity: A Conceptual Perspective draft of a book by Oded Goldreich
Computational Complexity: A Modern Approach, a draft of a book by Arora
BOOK IN PROGRESS WITH PHUONG NGUYEN, Foundations of Proof Complexity: Bounded Arithmetic and Propositional Translations, Stephen A. Cook.
Computability Theory:
-
COMP 006, Spring 2004, COMPUTABILITY, UNSOLVABILITY, and CONSCIOUSNESS, David A. Plaisted
Real Computation:
-
Ker-I Ko, Professor of Computer Science, State University of New York at Stony Brook. Not much info on his homepage. Some survey papers.
Vasco Brattka. There are some informations on this field and tutorial. The following figure is from his tutorial.
Lenore Blum, professor of CMU.
Probabilistically Checkable Proofs Sysyems:
-
Buffalo, CSE 713 - Probabilistically Checkable Proofs and Inapproximability, Fall 2004, Dr. Hung Q. Ngo.
Course on Probabilistically Checkable Proofs and Hardness of Approximation, March 2003 - June 2003, by Uri Feige
Georgia Inst. of Technology, College of Computing, CS 8002 Probabilistically Checkable Proofs and Hardness of Approximation, Subhash Khot
Berkeley, CS 294 | PCP and Hardness of Approximation | Spring 06, Luca Trevisan
M. Sudan, Probabilistic Checking of Proofs (PCPs), Outline of a 2-module course to be taught at the Scuola Matematica Interuniversitaria. July 3-15, 2005 & July 18-29, 2005.
Bibliographies and Links Collection:
2006-09-06 04:04:39
Optimization Theory
A branch of mathematics which encompasses many diverse areas of minimization and optimization. Optimization theory is the more modern term for operations research. Optimization theory includes the calculus of variations, control theory, convex optimization theory, decision theory, game theory, linear programming, Markov chains, network analysis, optimization theory, queuing systems, etc.
Ref. :Optimization (mathematics) - Wikipedia, the free encyclopediaRef. :Operations research - Wikipedia, the free encyclopedia
Since I decided to attend the course Optimization Theory of NCTU, I would write a note for collecting some resources.
Lecture Notes:
-
LSE MA208: Optimisation Theory 2005/6,
CalTech ACM 113 Introduction to Optimization
A Tutorial on Integer Programming
A Tutorial on Dynamic Programming, Michael A. Trick
A Tutorial on Stochastic Dynamic Programming, Michael A. Trick
EE364: Convex Optimization, Stephen Boyd, Stanford University, Spring quarter 2005-06.
EE392o: Optimization Projects, Stephen Boyd, Stanford University, Autumn Quarter 2003-2004
EE236B: Nonlinear Programming, UCLA Electrical Engineering Department, Prof. Lieven Vandenberghe
EE236A: Linear Programming, UCLA Electrical Engineering Department, Prof. Lieven Vandenberghe, Fall Quarter 2005-2006
Lehigh University IE495 -- Stochastic Programming, by Prof. Jeff Linderoth. [IE418 -- Integer Programming], [IE417 -- Nonlinear Programming]
Stochastic Programming Fall 2002, John E. Mitchell
Topics in Operations Research--Semidefinite Programming: Lecture Notes, Spring 1995, Farid Alizadeh
Standford MS&E 318/CME 338 Spring 2006, Large-Scale Numerical Optimization.
-
MIT OpenCourseWare | Sloan School of Management contains a lot of courses about optimization theory, such as Introduction to Optimization, Systems Optimization, Network Optimization, Readings in Optimization.
Optimization, Spring 2006, Peter Bro Miltersen.
Graduate course: "Complexity Transitions in Optimization Problems", Helsinki U. of Tech., Fall 2003
PSU CSE/MATH 555, Numerical Optimization, Hongyuan Zha
People
-
Ted Ralphs, an associate professor of dept. of Industrial and Systems Engineering, Lehigh University. He maintains many nice stuff on his homepage, including lecture notes and resources of research.
MIT, Dimitri Bertsekas. Author of many books and teach many courses.
Katta G. Murty, Professor, industrial and Operations engineering, The University of Michigan, Ann Arbor.
Monique Laurent, CWI
Alexander Schrijver(Lex Schrijver), CWI
Yinyu Ye, Stanford. He taught many courses like Linear and Conic Optimization, Linear and Nonlinear Optimization, etc.
Others
-
Stochastic Programming Community
semidefinite community
Stochastic Programming Bibliography.
Papers on semidefinite Programming, Farid Alizadeh
External Sources of Numerical Optimization Information, Numerical Analysis Group, Computational Science & Engineering Department, CCLRC.
2006-11-16 Update: In the course, Macroeconomic Analysis I, I heard the term ``Dynamic Optimization''.
I knew the Optimization Theory contains a lot of sub-fields like combinatorial optimization, Stochastic programming, however I didn't know this.
Ref. Dynamic optimization problems:
Dynamic optimization problems involve dynamic variables whose values change in time. Such problems exist in the areas of optimal control, parameter estimation for dynamic models, reactor network synthesis where the dynamic models arise from the differential modeling of the tubular reactors (plug flow reactors), and for dynamic simulation and optimization.
Some links put here-
MIT OCW 14.128 Dynamic Optimization & Economic Applications (Recursive Methods), Spring 2003
Texas A&M University. Agec 637 Production Economics and Dynamic Optimization, Prof. Woodward.
Course 02711: Static and Dynamic Optimization
2006-09-05 22:24:01
Integrality and the Unimodular Property
Integrality and the Unimodular Property
We knew that for some network problems, there are combinatorial algorithms and linear programming algorithms. It means the solutions of those problem are integers, i.e., integrality of network problems, and generally the integer linear programming is in NP-Complete. Thus I want to know the criteria of integrality and tractability.
Ref. : Chapter 4 NETWORK FLOWS, section 12 Integrality of Flows and the Unimodular Property, Lawler, Eugene L. Combinatorial Optimization. Holt, Rinehart, and Winston, 1976.
Ref. : Chapter 8. Integer linear programming and totally unimodular matrices, A Course in Combinatorial Optimization, Alexander Schrijver, February 1, 2006.
Theorem 2.2. Let $P=\{x \| Ax\leq b\}$ be a polyhedron in $R^n$ and let $z\in P$. Then $z$ is a vertex of $P$, if and only if $rank(A_z) = n$.}
where $A_z$ is the submatrix of $A$ consisting of those rows $a_i$ of $A$ for which $a_i\times z= b_i$. $A_z$ is the submatrix of A consisting of those rows $a_i$ of $A$ for which $a_i\times z=b_i$.A matrix A is called totally unimodular if each square submatrix of A has determinant equal to 0, +1, or -1. In particular, each entry of a totally unimodular matrix is 0, +1, or -1.
Theorem 8.1. Let A be a totally unimodular $m\times n$ matrix and let $b\in Z^m$. Then each vertex of the polyhedron
(9) $P:=\{x \| Ax\leq b\}$
is an integer vector.
Theorem 8.2. Let A be an integer $m\times n$ matrix of rank $m$. Then A is unimodular, if and only if for each integer vector $b$ the polyhedron
(18) $P=\{x \| x>0; Ax = b\}$
is integer.
Theorem 12.1 (Integral Circulation Theorem) If all lower bounds and capacities are integers and there exists a finite optimal circulation, then there exists an integral optimal circulation (whether or not arc costs are integers).
Corollary 12.3 Let $C'$ be the convex polytope defined by the inequality constraints $A'x\leq b, x \geq 0$, where $A'$ is an integer matrix. The following three conditions are equivalent:
(12.2) $A'$ is totally unimodular.
(12.3) The extreme points o f$C'$ are all integral for any integer vector $b$.
(12.4) Every nonsingular submatrix of$A'$ has an integer inverse.
Unfortunately, there do not seem to be any easily tested necessary and sufficient conditions for total unimodularity. Perhaps the most elegant such conditions are due to Camion, which we state without proof. A matrix is said to be Eulerian if the sum of the elements in each row and in each column is even.
Theorem 12.4 (Cumion) A (0, +l, -1) matrix is totally unimodular if and only if the sum of the elements in each Eulerian square submatrix is a multiple of four.
Theorem 12.5 .4 (0, + 1, -1) matrix A is totally unimodular if both of the following conditions are satisfied:
(12.5) Each column contains at most two nonzero elements.
(12.6) The rows of A can be partitioned into two sets A, and $A_2$, such that two nonzero entries in a column are in the same set of rows if they have different signs and in different sets of rows if they have the same sign.
2006-09-05 05:57:13
Move some old stuff to here
I moved this blog to my Labˇ¦s server several months ago. Now I think the new one works well and satisfy my needs so that I announce it.
Its blog system is nanoblogger
2006-09-03 02:49:42
Property Testing
Thus it is the reason I create this note....:)
We consider the question of determining whether a given object has a predetermined property or is ``far'' from any object having the property. Specifically, objects are modeled by functions, and distance between functions is measured as the fraction of the domain on which the functions differ.
From Combinatorial Property Testing, Oded GoldreichAt first, I would like to wirte down some URLs of articles not so formal.
-
Additive combinatorics (by Ryan O'Donnell)
Szemerdi's Regularity Lemma, used nonstop in the area of property testing, was developed originally for "Szemerdi's Theorem" from additive combinatorics. Ben Green also has a version of the regularity lemma for boolean functions, which I used in a paper on hardness of approximation for "Grothendieck problems".
There are a lot of articles on Luca Trevisan's blog, i.e., series articles of Szemeredi's theorem [1] [2] [3] [4] [5], Gowers uniformity, The Szemeredi Regularity Lemma, Pseudorandomness and more pseudorandomness. Property Testing and its Connection to Learning and Approximation Holographic Proofs, by Leonid A. LevinI used the term property testing in google for searching Taiwaness researchers, there are only two guys, one is the student of prof. Lyuu, the other one is Chuang-Chieh Lin.
-
Chen, Hsuan-Chang. Testing Connectivity of Graphs. Master's thesis.(prof. Lyuu's student)
Wang, Yi-Wen. Approximate Algorithms on Checking Proofs with Probabilistic Methods. Master's thesis.(prof. Lyuu's student)
A variant of the hypergraph removal lemma, Terence Tao, Journal of Combinatorial Theory, Series A, Volume 113, Issue 7 , October 2006, Pages 1257-1280
Connectivity Properties of Matroids, Milena Mihail and Madhu Sudan, 1992, Technical Report: CSD-92-662
Property testing and its connection to learning and approximation, Oded Goldreich and Shari Goldwasser and Dana Ron, JACM, 1998.
Gowers uniformity, influence of variables, and PCPs, Alex Samorodnitsky and Luca Trevisan, STOC '06
Johan Hastad, Avi Wigderson, Simple analysis of graph tests for linearity and PCP, Random Structures and Algorithms, vol. 22, no. 2, pp. 139-160, 2003.
Fast approximate PCPs, Funda Ergun and Ravi Kumar and Ronitt Rubinfeld, STOC'99
People:
-
Ronitt Rubinfeld, Professor of Electrical Engineering and Computer Science at MIT,
Prof. Dana Ron, Department of Electrical Engineering - Systems, Tel-Aviv University
Eldar Fischer, Computer Science Department, Technion - Israel Institute of Technology
Some other Bibliographies [1], [2]
Some paper I read:
-
Kumar, R., Rubinfeld, R., ``Sublinear time algorithms'', Samir Khuller's Algorithms column in SIGACT News 2003.
It introduced the sublinear algorithms. It introduced the traditional sublinear algorithms like k-median and linear programming.
The Art of Uniformed Decisions.
After reading this paper, I am interested in non-testable. I think Property Testing could be taken as the approximation of the optimization problems which are transformed(or re-formulated) from decision problems like Zuckerman's ``NP-complete problems have a version that's hard to approximate.'' Structure in Complexity Theory Conference, 1993., Proceedings of the Eighth Annual, 1993, 305-312.
Therefore the distance parameter (epsilon-far, -close) is the approximation ratio of problem in this view.
Testing Connectivity of Graphs. Property Testing :A Tutorial, Dana Ron, Appeared: Handbook on Randomization, volume II, Edited by: S. Rajasekaran, P. M. Pardalos, J. H. Reif, and J. D. P. Rolim, 39 pages.
Half pages of this paper are about testing graph properties. E. Fischer's survey is more general but harder to read. (1st order and rho clique)
[Unread]Property testing in bounded degree graphs, O. Goldreich and D. Ron. Proc. 28th Symposium on Theory of Computing, pp. 406--415, 1997.
I wanted to read this paper, because a paper I had read cite it.. However this paper lacked too much details, I can't understand it. Though I read the journal version, Algorithmica, Vol. 32 (2), pages 302-343, 2002. (Ohmm...42 pages) I might read the part about connectiveity testing. [Unread] WANG, Y. Approximate Algorithms on Checking Proofs with Probabilistic Methods, master thesis, Dept. CS, NTU, 2000. [Unread] Funda Ergun , Ravi Kumar and Ronitt Rubinfeld, Fast approximate probabilistically checkable proofs, Information and Computation, Volume 189, Issue 2, , 15 March 2004, Pages 135-159. [Unread] Shapira, A. A combinatorial characterization of the testable graph properties: it's all about regularity STOC '06: Proceedings of the thirty-eighth annual ACM symposium on Theory of computing, ACM Press, 2006, 251-260