Welcome to lifeintel.org

Here you can download the source code for SigOli. The latest version is SigOli 1.1 (download).

SigOli is the result of a number of projects carried out between 2001 and 2005 by LifeIntel Software Inc.
SigOli is now distributed under the GNU General Public License (GPL), which makes it free software.
Make sure you understand the license terms before using SigOli.

How to build SigOli:

- you need a version of UNIX, or Windows with Cygwin
- the software compiles with some warnings and has been tested on a few flavours of Linux, Solaris and Cygwin
- you need flex and bison installed on your machine (most distributions include them; if yours does not, install them first)
- download the latest version of SigOli
- extract all files from the archive into an empty directory (run tar -xzf sigoli-XX.tar.gz)
- run make in the directory where you extracted the files
- if all works well, the SigOli binary can be found in the subdirectory "b"
- a copy of the GPL can be found in the subdirectory "l"
- contact information is in the source code; all comments are welcome
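
The build steps above can be sketched as a small shell script. This is only a sketch under stated assumptions, not the project's own build script. The function names missing_tools and build_sigoli are hypothetical. The archive name keeps the XX placeholder used in the instructions. The sketch also assumes a tar that accepts -C (GNU tar and bsdtar do) and that the archive unpacks directly into the target directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of SigOli): print each named tool
# that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Hypothetical wrapper around the documented build steps.
# $1: path to the downloaded archive, e.g. sigoli-XX.tar.gz
# $2: an empty directory to build in
build_sigoli() {
    archive=$1
    build_dir=$2
    missing=$(missing_tools flex bison make tar)
    if [ -n "$missing" ]; then
        echo "install these first: $missing" >&2
        return 1
    fi
    mkdir -p "$build_dir" || return 1
    # Extract into the build directory (assumes tar supports -C).
    tar -xzf "$archive" -C "$build_dir" || return 1
    # Run make where the files were extracted.
    (cd "$build_dir" && make) || return 1
    # Per the instructions, the binary should end up in subdirectory "b".
    ls "$build_dir/b"
}
```

After sourcing the script, a call such as build_sigoli sigoli-XX.tar.gz build (with XX replaced by the version you downloaded) would run the steps in order and stop at the first failure.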

How to use SigOli:

- sigoli expects its input sequences in FASTA format
- for a new search, start with an empty directory (the search directory)
- store every sequence as a file in the search directory
- store every group of sequences as files in a group directory inside the search directory
- run sigoli with the search directory as a parameter (see the command line options below)
- a worked example is available as an archive file
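
The directory layout described above can be sketched as follows. All names here (write_fasta, make_search_dir, search, groupA, seq1.fasta and so on) are made up for illustration, and the nucleotide strings are placeholders rather than real data. The source does not state any file naming convention, so the .fasta extension is an assumption too.

```shell
#!/bin/sh
# Hypothetical helper: write a one-record FASTA file.
# $1: output file, $2: sequence identifier, $3: nucleotide string
write_fasta() {
    printf '>%s\n%s\n' "$2" "$3" > "$1"
}

# Hypothetical layout; all names and sequences are placeholders.
# $1: the search directory (should not exist yet, or be empty)
make_search_dir() {
    search_dir=$1
    mkdir -p "$search_dir/groupA" || return 1
    # A stand-alone sequence: one file directly in the search directory.
    write_fasta "$search_dir/seq1.fasta" seq1 ACGTACGTACGTACGTACGT
    # A group of sequences: one file per sequence inside a group directory.
    write_fasta "$search_dir/groupA/a1.fasta" a1 ACGTTCGTACGAACGTACGT
    write_fasta "$search_dir/groupA/a2.fasta" a2 ACGTTCGTACGAACGTACGA
}
```

Each file holds a single FASTA record: a header line starting with ">" followed by the sequence itself.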

Command line options: [option-name=option-value]*

- operation=(operation-name); supported operations:
  - strings: writes to the output all oligo strings from all sequences and all groups
  - positions: generates an input file for Array Designer (tab-separated list of oligo sites)
  - ranges: writes a list of all ranges of oligos from each sequence and each group
  - ambig: writes a list of all ambiguous subsequences that were discarded because they have more ambiguities than max-unambiguous-count

- sequence-directory=(relative-directory-name); the directory containing the sequences and directories to be analysed
- oligo-size=(oligo-size); sets the size of the oligos to be discovered; default=16
- ambiguous=(yes/no); (obsolete) if yes, ambiguous subsequences may be considered oligos
- diff=(yes/no); indicates whether small differences (1 nucleotide) are considered
- crowded=(yes/no); for the ranges and positions operations, indicates whether an oligo range is populated with intermediate sites
- stop-on-error=(yes/no); indicates whether the system stops when it encounters an invalid sequence file; default=no
- first-site-gap=(gap-size); for a crowded display, indicates the size of the gap between the border of the range and the first interior site
- inter-site-gap=(gap-size); for a crowded display, indicates the size of the gap between sites inside an oligo range
- max-unambiguous-count=(count); indicates the maximum number of unambiguous sequences that will be considered in a disambiguation
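
As an illustration of the option syntax, the sketch below assembles a command line and prints it instead of running it. It rests on assumptions: the helper name sigoli_command is hypothetical, the binary is assumed to be invoked as sigoli, and options are written as option-name=option-value with no leading dash, reading the dashes in the list above as list bullets. If your build expects a different form, adapt the printf line.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a sigoli command line.
# $1: operation (strings, positions, ranges or ambig)
# $2: sequence directory, relative to the current directory
# $3: oligo size; defaults to 16, the documented default
sigoli_command() {
    case $1 in
        strings|positions|ranges|ambig) ;;
        *) echo "unsupported operation: $1" >&2; return 1 ;;
    esac
    # Assumed syntax: option-name=option-value, space separated.
    printf 'sigoli operation=%s sequence-directory=%s oligo-size=%s\n' \
        "$1" "$2" "${3:-16}"
}
```

For example, sigoli_command ranges search prints "sigoli operation=ranges sequence-directory=search oligo-size=16". Printing the line first makes it easy to check the options before running the binary from the "b" subdirectory.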

Bibliography

- Zahariev M, Dahl V, Chen W, Lévesque CA, 2009. Efficient algorithms for the discovery of DNA oligonucleotide barcodes from sequence databases. Molecular Ecology Resources 9(s1), 58-64.
- Chen W, Seifert KA, Lévesque CA, 2009. A high density COX1 barcode oligonucleotide array for identification and detection of species of Penicillium subgenus Penicillium. Molecular Ecology Resources 9(s1), 114-129.