Manual
GapsMis is a tool for pairwise sequence alignment with a bounded number of gaps. It focuses on semi-global alignment and restricts the alignment to a variable, but bounded, number of gaps.
This is how it computes the penalty for a gap of n positions:
gap opening penalty + (n - 1) * gap extension penalty
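The formula above can be written as a small C function. This is only a sketch: the name gap_penalty is hypothetical and not part of GapsMis, and returning 0.0 for n = 0 (no gap) is an assumption, since the formula is stated for gaps of at least one position.

```c
/* Penalty for a single gap of n positions:
 *   gap_open + (n - 1) * gap_extend
 * Assumption: n == 0 means "no gap" and costs nothing. */
double gap_penalty(unsigned int n, double gap_open, double gap_extend)
{
    if (n == 0)
        return 0.0;
    return gap_open + (double)(n - 1) * gap_extend;
}
```

With the default penalties (10.0 to open, 0.5 to extend), gap_penalty(4, 10.0, 0.5) gives 10.0 + 3 * 0.5 = 11.5, while a gap of one position costs exactly the opening penalty, 10.0.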
Usage
Usage: gapsmis <options>
Standard (Mandatory):
  -a, --sequence-a <str>
      Sequence A filename.
  -b, --sequence-b <str>
      Sequence B filename.
Optional:
  -g, --gap-open-penalty <float>
      The gap open penalty is the score taken away when a gap is created.
      The best value depends on the choice of comparison matrix. The default
      value assumes you are using the EBLOSUM62 matrix for protein sequences,
      and the EDNAFULL matrix for nucleotide sequences.
      Floating point number from 1.0 to 100.0. (default: 10.0)
  -e, --gap-extend-penalty <float>
      The gap extension penalty is added to the standard gap penalty for each
      base or residue in the gap. This is how long gaps are penalized.
      Floating point number from 0.0 to 10.0. (default: 0.5)
  -o, --output-file <str>
      Output alignment filename. (default: gapsmis.out)
  -d, --data-file <str>
      This is the scoring matrix used when comparing sequences. It can be
      either `EBLOSUM62' (for protein sequences) or `EDNAFULL' (for
      nucleotide sequences). (default: EDNAFULL)
  -l, --max-num-gaps <int>
      Limit the maximum number of allowed gaps to this value. (default: 2)
  -m, --max-gap <int>
      Limit the maximum gap size to this value.
      (default: length of the longest sequence minus 1)
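The ranges quoted for the two penalty options can be checked before a run. The helpers below are a minimal sketch with hypothetical names; they are not part of the tool, and treating both bounds as inclusive is an assumption, since the usage text does not say whether the endpoints are allowed.

```c
#include <stdbool.h>

/* Gap open penalty: documented as a float from 1.0 to 100.0.
 * Assumption: both endpoints are accepted. */
bool gap_open_in_range(double g)
{
    return g >= 1.0 && g <= 100.0;
}

/* Gap extension penalty: documented as a float from 0.0 to 10.0.
 * Assumption: both endpoints are accepted. */
bool gap_extend_in_range(double e)
{
    return e >= 0.0 && e <= 10.0;
}
```

The defaults (10.0 and 0.5) pass both checks, whereas an opening penalty of 0.5 falls outside the documented range.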
GapMis is a tool for pairwise sequence alignment with a single gap. It focuses on semi-global alignment and restricts the alignment to at most one gap.
This is how it computes the penalty for a gap of n positions:
gap opening penalty + (n - 1) * gap extension penalty
Usage
Usage: gapmis [options]
Standard (Mandatory):
  -a, --sequence-a <str>
      Sequence A filename.
  -b, --sequence-b <str>
      Sequence B filename.
Optional:
  -g, --gap-open-penalty <float>
      The gap open penalty is the score taken away when a gap is created.
      The best value depends on the choice of comparison matrix. The default
      value assumes you are using the EBLOSUM62 matrix for protein sequences,
      and the EDNAFULL matrix for nucleotide sequences.
      Floating point number from 1.0 to 100.0. (default: 10.0)
  -e, --gap-extend-penalty <float>
      The gap extension penalty is added to the standard gap penalty for each
      base or residue in the gap. This is how long gaps are penalized.
      Floating point number from 0.0 to 10.0. (default: 0.5)
  -o, --output-file <str>
      Output alignment filename. (default: gapsmis.out)
  -d, --data-file <str>
      This is the scoring matrix used when comparing sequences. It can be
      either `EBLOSUM62' (for protein sequences) or `EDNAFULL' (for
      nucleotide sequences). (default: EDNAFULL)
  -m, --max-gap <int>
      Limit the maximum gap size to this value.
      (default: length of the longest sequence minus 1)
GapMis-OMP is the OpenMP-based version of GapMis. It is designed to compute, in parallel, the alignments between every sequence in a first set and every sequence in a second set.
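The all-against-all pattern described above can be sketched as follows. This is not GapMis-OMP's code: toy_score is a hypothetical stand-in that only counts matching positions, and score_all_pairs is an illustrative name. The sketch shows only that each (query, target) pair is independent, so the outer loop can be divided among OpenMP threads.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an alignment score: the number of positions
 * at which the two strings agree. This is NOT the GapMis scoring. */
static double toy_score(const char *q, const char *t)
{
    size_t lq = strlen(q), lt = strlen(t);
    size_t n = lq < lt ? lq : lt;
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        if (q[i] == t[i])
            s += 1.0;
    return s;
}

/* Score every query against every target. scores must hold nq * nt
 * doubles; the score of (queries[i], targets[j]) is written to
 * scores[i * nt + j]. Each pair is independent, so the outer loop is
 * shared among threads when compiled with OpenMP; otherwise it runs
 * serially. */
void score_all_pairs(const char **queries, size_t nq,
                     const char **targets, size_t nt,
                     double *scores)
{
#ifdef _OPENMP
    #pragma omp parallel for
#endif
    for (long i = 0; i < (long)nq; i++)
        for (size_t j = 0; j < nt; j++)
            scores[(size_t)i * nt + j] = toy_score(queries[i], targets[j]);
}
```

Every thread writes to a distinct slot of scores, so no locking is needed. With GCC or Clang, OpenMP is typically enabled with -fopenmp.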
Usage
Usage: gapmis
Standard (Mandatory):
  -a, --sequences-a <str>
      Query sequences of length m filename.
  -b, --sequences-b <str>
      Target sequences of length n>=m filename.
Optional:
  -g, --gap-open-penalty <float>
      The gap open penalty is the score taken away when a gap is created.
      The best value depends on the choice of comparison matrix. The default
      value assumes you are using the EBLOSUM62 matrix for protein sequences,
      and the EDNAFULL matrix for nucleotide sequences.
      Floating point number from 1.0 to 100.0. (default: 10.0)
  -e, --gap-extend-penalty <float>
      The gap extension penalty is added to the standard gap penalty for each
      base or residue in the gap. This is how long gaps are penalized.
      Floating point number from 0.0 to 10.0. (default: 0.5)
  -o, --output-file <str>
      Output alignment filename. (default: gapmis.out)
  -f, --output-format <int>
      Output alignment format. 0 for outputting only the scores and 1 for
      EMBOSS-like output. (default: 0)
  -d, --data-file <str>
      This is the scoring matrix used when comparing sequences. It can be
      either `EBLOSUM62' (for protein sequences) or `EDNAFULL' (for
      nucleotide sequences). (default: EDNAFULL)
  -t, --threads <int>
      Number of threads to be used. (default: 1)
  -m, --max-gap <int>
      Limit the maximum gap size to this value.
      (default: length of the longest sequence minus 1)
libgapmis is an ultrafast library for pairwise short-read alignment, based on GapMis, including accelerated SSE-based and GPU-based versions.
Man page
NAME
libgapmis - a library for pairwise short-read single-gap alignment

SYNOPSIS
#include

unsigned int gapmis_one_to_one ( const char *p, const char *t,
    const struct gapmis_params *in, struct gapmis_align *out );

unsigned int gapmis_one_to_one_scr ( const char *p, const char *t,
    const struct gapmis_params *in, double *scr );

unsigned int gapmis_results_one_to_one ( const char *filename,
    const char *p, const char *p_header,
    const char *t, const char *t_header,
    const struct gapmis_params *in, struct gapmis_align *out );

unsigned int gapmis_one_to_many ( const char *p, const char **t,
    const struct gapmis_params *in, struct gapmis_align *out );

unsigned int gapmis_one_to_many_opt ( const char *p, const char **t,
    const struct gapmis_params *in, struct gapmis_align *out );

unsigned int gapmis_one_to_many_opt_sse ( const char *p, const char **t,
    const struct gapmis_params *in, struct gapmis_align *out );

unsigned int gapmis_one_to_many_opt_gpu ( const char *p, const char **t,
    const struct gapmis_params *in, struct gapmis_align *out );

unsigned int gapmis_results_one_to_many ( const char *filename,
    const char *p, const char *p_header,
    const char **t, const char **t_header,
    const struct gapmis_params *in, struct gapmis_align *out );

unsigned int gapmis_many_to_many ( const char **p, const char **t,
    const struct gapmis_params *in, struct gapmis_align *out );

unsigned int gapmis_many_to_many_opt ( const char **p, const char **t,
    const struct gapmis_params *in, struct gapmis_align *out );

unsigned int gapmis_many_to_many_opt_sse ( const char **p, const char **t,
    const struct gapmis_params *in, struct gapmis_align *out );

unsigned int gapmis_many_to_many_opt_gpu ( const char **p, const char **t,
    const struct gapmis_params *in, struct gapmis_align *out );

unsigned int gapmis_results_many_to_many ( const char *filename,
    const char **p, const char **p_header,
    const char **t, const char **t_header,
    const struct gapmis_params *in, struct gapmis_align *out );

unsigned int gapsmis_one_to_one ( const char *p, const char *t,
    const struct gapmis_params *in, struct gapmis_align *out );

unsigned int gapsmis_one_to_one_scr ( const char *p, const char *t,
    const struct gapmis_params *in, double *scr );

unsigned int gapsmis_one_to_one_onf ( const char *p, const char *t,
    const struct gapmis_params *in, struct gapmis_align *out );

unsigned int gapsmis_results_one_to_one ( const char *filename,
    const char *p, const char *p_header,
    const char *t, const char *t_header,
    const struct gapmis_params *in, struct gapmis_align *out );

DESCRIPTION
The functions in the libgapmis library compute the optimal semi-global single-gap alignment(s) between a string (or a set of strings) representing the pattern(s) and a string (or a set of strings) representing the text(s).

The function gapmis_one_to_one() computes the optimal semi-global single-gap alignment between a single pattern string p and a text string t. The alignment is done according to the parameters specified by in and the resulting alignment is stored in out.

The function gapmis_one_to_one_scr() computes only the score of the optimal semi-global single-gap alignment between a single pattern string p and a text string t. The alignment is done according to the parameters specified by in and the resulting score is stored in scr.

The function gapmis_results_one_to_one() prints details of the alignment computed by function gapmis_one_to_one() in file filename.

The function gapmis_one_to_many() computes the optimal semi-global single-gap alignments between a single pattern string p and a set of text strings t. The alignment is done according to the parameters specified by in and the resulting alignments are stored in out.

The function gapmis_one_to_many_opt() computes the optimal semi-global single-gap alignment with the maximum score between a single pattern string p and a set of text strings t.
The alignment is done according to the parameters specified by in and the resulting alignments are stored in out. The accelerated SSE- and GPU-based versions of gapmis_one_to_many_opt() are gapmis_one_to_many_opt_sse() and gapmis_one_to_many_opt_gpu(), respectively.

The function gapmis_results_one_to_many() prints details of the alignments computed by function gapmis_one_to_many() in file filename.

The function gapmis_many_to_many() computes the optimal semi-global single-gap alignments between a set of pattern strings p and a set of text strings t. The alignment is done according to the parameters specified by in and the resulting alignments are stored in out.

The function gapmis_many_to_many_opt() computes the optimal semi-global single-gap alignments with the maximum score between each pattern from a set of pattern strings p and a set of text strings t. The alignment is done according to the parameters specified by in and the resulting alignments are stored in out. The accelerated SSE- and GPU-based versions of gapmis_many_to_many_opt() are gapmis_many_to_many_opt_sse() and gapmis_many_to_many_opt_gpu(), respectively.

The function gapmis_results_many_to_many() prints details of the alignments computed by function gapmis_many_to_many() in file filename.

The function gapsmis_one_to_one() computes a semi-global alignment between a single pattern string p and a text string t. By splitting the pattern into several fragments the user can identify multiple gaps in the alignment: a single-gap alignment is performed per fragment. The alignment for each fragment is done according to the parameters specified by in, where the number of fragments can be specified, and the resulting alignment is stored in out.

The function gapsmis_one_to_one_scr() computes only the total score of the semi-global alignment between a single pattern string p and a text string t.
The alignment for each fragment is done according to the parameters specified by in and the resulting total score is stored in scr.

The function gapsmis_one_to_one_onf() computes the optimal semi-global alignment between a number of fragments of a single pattern string p and a text string t by splitting the pattern into the optimal number of fragments. The optimal number of fragments is the number of fragments resulting in the highest score. The alignment for each fragment is done according to the parameters specified by in, where the maximum number of fragments can be specified, and the resulting alignment is stored in out.

The function gapsmis_results_one_to_one() prints details of the alignments computed by function gapsmis_one_to_one() or by function gapsmis_one_to_one_onf() in file filename.

The structure gapmis_params passed for the in argument is defined as:

struct gapmis_params
{
    unsigned int max_gap;
    unsigned int scoring_matrix;
    double       gap_open_pen;
    double       gap_extend_pen;
    unsigned int num_frags;
};

The maximum allowed length of the gap is specified in max_gap; the scoring_matrix can be either EDNAFULL or EBLOSUM62. The gap open penalty and gap extension penalty are specified by gap_open_pen and gap_extend_pen, respectively. The number of fragments is specified by num_frags, and it is used only in functions gapsmis_one_to_one(), gapsmis_one_to_one_scr(), and gapsmis_one_to_one_onf(). In function gapsmis_one_to_one_onf(), num_frags specifies the maximum number of fragments.

The structure gapmis_align passed for the out argument is defined as:

struct gapmis_align
{
    double       max_score;
    unsigned int min_gap;
    unsigned int where;
    unsigned int gap_pos;
    unsigned int num_mis;
};

The computed optimal alignment consists of the maximum score max_score, the length of the gap min_gap, the location of the gap where (0 if the gap occurs in the pattern, 1 if it occurs in the text), the starting position of the gap gap_pos, and the number of detected mismatches num_mis.
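To illustrate how the fields of gapmis_align fit together, the sketch below formats a result as a one-line summary. The struct is copied from the definition above so that the example compiles on its own; in real use it would come from the library header. The function describe_alignment is hypothetical and not part of libgapmis, and treating min_gap == 0 as "no gap" is an assumption that the man page does not state.

```c
#include <stdio.h>

/* Copied from the man page; normally provided by the library header. */
struct gapmis_align {
    double       max_score;
    unsigned int min_gap;
    unsigned int where;
    unsigned int gap_pos;
    unsigned int num_mis;
};

/* Write a one-line summary of an alignment into buf (at most len bytes)
 * and return the value of snprintf. The wording is illustrative only.
 * Assumption: min_gap == 0 means the alignment contains no gap. */
int describe_alignment(const struct gapmis_align *a, char *buf, size_t len)
{
    if (a->min_gap == 0)
        return snprintf(buf, len, "score %.1f, no gap, %u mismatches",
                        a->max_score, a->num_mis);
    return snprintf(buf, len,
                    "score %.1f, gap of %u in %s at position %u, %u mismatches",
                    a->max_score, a->min_gap,
                    a->where == 0 ? "pattern" : "text",
                    a->gap_pos, a->num_mis);
}
```

For a result with max_score 42.5, min_gap 3, where 1, gap_pos 7 and num_mis 2, the summary reads "score 42.5, gap of 3 in text at position 7, 2 mismatches".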
In the case of single-gap alignment per fragment, an array out of gapmis_align structs is returned, each containing the starting position of the gap gap_pos and the number of detected mismatches num_mis relative to the respective fragment.
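A per-fragment result array can be walked as in the sketch below. The helper report_fragments is hypothetical and not part of libgapmis; it assumes the caller knows how many entries the array holds (here passed as num_frags) and that each entry's gap_pos and num_mis are relative to its own fragment, as stated above. The struct is copied from the man page so that the example compiles on its own.

```c
#include <stdio.h>

/* Copied from the man page; normally provided by the library header. */
struct gapmis_align {
    double       max_score;
    unsigned int min_gap;
    unsigned int where;
    unsigned int gap_pos;
    unsigned int num_mis;
};

/* Print the gap position and mismatch count of each fragment to fp and
 * return the total number of mismatches over all fragments.
 * Assumption: out points to num_frags valid entries. */
unsigned int report_fragments(const struct gapmis_align *out,
                              unsigned int num_frags, FILE *fp)
{
    unsigned int total = 0;
    for (unsigned int i = 0; i < num_frags; i++) {
        fprintf(fp, "fragment %u: gap_pos=%u num_mis=%u\n",
                i, out[i].gap_pos, out[i].num_mis);
        total += out[i].num_mis;
    }
    return total;
}
```

Because the positions are fragment-relative, mapping them back to coordinates in the whole pattern would also require the fragment boundaries, which this sketch does not attempt to reconstruct.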