Originators: I. Shparlinski (presented by J. Cooper - REGS 2009)
We combine the 2009 presentation with background and another problem from a REGS presentation in 2006 by Stephen Hartke.
Definitions: A De Bruijn cycle of order k is a cyclic string of length n over an alphabet of size s such that the strings of k consecutive letters are distinct. We say that a de Bruijn cycle of order k over S is full when it has length s.
Background: De Bruijn [dB, 1946] proved that such cycles exist when s=2 and n=2k. This was also proved by Good [G, 1946] and by Flye Sainte-Marie [F, 1894], but De Bruijn's name stuck even after he later acknowledged priority by Flye Sainte-Marie. The result generalizes easily for all s; cycles exist whenever k≤n≤sk.
For s=2, one can ask how "balanced" a full de Bruijn cycle is between 0s and 1s. Starting from a given point in the cycle S, the discrepancy D(S) is the maximum difference between the number of 0s and number of 1s in any initial portion of the cycle. This difference can be written as ∑j=0t(-1)a(j), where a(j) denotes the jth entry in the cycle. Note that D(S) can be generalized to alphabets of size s by replacing -1 with an sth root of unity. Shparlinski proved for s=2 that every cycle has discrepancy at most O(2n/√n).
Problem 1: Find a full de Bruijn cycle with discrepancy Ω(2n/√n).
Motivation: Another problem arises from an application. De Bruijn sequences have attracted attention from biologists for their use in sequencing genomes. A strand of DNA is chopped into pieces of size k and placed in a liquid containing an indicator strand. In the indicator strand, each k-string appears at most once, so the test strand can only bind to one place along the indicator strand. To keep the indicator from folding and binding to itself, we seek De Bruijn sequences where also no k-string is the reversed complement of another. Say that a k-string is equivalent to its reversed complement. (Complementation is defined by any matching on |S|; the standard example is the pair {0,1} when |S|=2.)
Question 2: What lengths of De Bruijn cycles exist such that the strings captured by windows of length k all lie in distinct equivalence classes under reverse complementation?
Comments: Stephen Hartke reports that Question 2 was answered fairly thoroughly by Anderson, Fox, and Niblo [AFN]. Question 2 has been considered for other equivalence relations. For s=2 and equivalence under complementation, De Bruijn cycles exist when 4≤n≤2^k and n≠2^k-2 (Hartke [H]). For s=2 and for equivalence under reversal, only bounds on the range are known (Hartke [H]), but these are asymptotically of the same order. Perhaps the methods from Hartke [H] and Dai-Martin-Robshaw-Wild [DMRW] can be combined to produce De Bruijn cycles that are distinct under simultaneious reversal and complementation.
References:
[AFN] Anderson, James W.; Fox, Keith R.; Niblo, Graham A.:
A fast algorithm for the construction of universal footprinting templates in
DNA.
J. Math. Biol. 52 (2006), no. 3, 307--342.
[BMRW] Dai, Z.-D.; Martin, K. M.; Robshaw, M. J. B.; Wild, P. R.:
Orientable sequences. Cryptography and coding, III (Cirencester, 1991),
97--115, Inst. Math. Appl. Conf. Ser. New Ser., 45,
Oxford Univ. Press, New York, 1993.
[dB] de Bruijn, N. G.
A combinatorial problem.
Nederl. Akad. Wetensch., Proc. 49, (1946) 758--764,
Indagationes Math. 8, 461--467 (1946).
[F] C. Flye Sainte-Marie,
Solution to question nr. 48,
Interm\'ediaire des Math\'ematiciens 1 (1894), 107-110.
[G] Good, I. J.
Normal recurring decimals.
J. London Math. Soc. 21, (1946). 167--169.
[H] Hartke, Stephen G.
Binary de Bruijn cycles under different equivalence relations.
Discrete Math. 215 (2000), no. 1-3, 93--102.