fastq

Strategies for generating FASTQ-formatted sequence and quality data.

fastq_quality()

@composite
def fastq_quality(draw, min_size: int = 0, max_size: Optional[int] = None, min_score: int = 0, max_score: int = 93, offset: int = 33) -> str

Generates quality strings for the FASTQ format.

Arguments

  • min_size: Minimum length of the quality string.
  • max_size: Maximum length of the quality string.
  • min_score: Lowest quality (PHRED) score to use.
  • max_score: Highest quality (PHRED) score to use.
  • offset: ASCII encoding offset added to each quality score.

Note

The default arguments produce quality strings in 'fastq-sanger' format. For 'fastq-illumina', set offset to 64 and max_score to 62. For 'fastq-solexa', set offset to 64, min_score to -5 and max_score to 62. See https://academic.oup.com/nar/article/38/6/1767/3112533 for more details on the FASTQ format and its quality score encoding.
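To make the relationship between score range and offset concrete, here is a minimal standard-library sketch of the encoding itself: each score is shifted by the offset and mapped to one ASCII character. The helper names encode_quality and decode_quality and the VARIANTS table are hypothetical, introduced only for illustration. They are not part of this module and do not show how fastq_quality() is implemented.

```python
# Hypothetical helpers for illustration only; not part of this module.

# (min_score, max_score, offset) for each variant named in the note above.
VARIANTS = {
    "fastq-sanger": (0, 93, 33),
    "fastq-illumina": (0, 62, 64),
    "fastq-solexa": (-5, 62, 64),
}


def encode_quality(scores, offset=33):
    """Encode integer quality scores as a FASTQ quality string."""
    return "".join(chr(score + offset) for score in scores)


def decode_quality(quality, offset=33):
    """Recover the integer quality scores from a FASTQ quality string."""
    return [ord(char) - offset for char in quality]


if __name__ == "__main__":
    print(encode_quality([0, 40, 93]))             # !I~
    print(encode_quality([0, 40, 62], offset=64))  # @h~
    print(decode_quality("!I~"))                   # [0, 40, 93]
```

With every parameter set in the table, the encoded characters stay within the printable ASCII range 33 to 126, which is why max_score differs between the variants.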

fastq_entry()

@composite
def fastq_entry(draw, min_size: int = 0, max_size: Optional[int] = None, min_score: int = 0, max_score: int = 93, offset: int = 33, sequence_source: Optional[SearchStrategy] = None, identifier_source: Optional[SearchStrategy] = None, additional_description: bool = True, wrap_length: int = 80) -> str

Generates entries in FASTQ format.

Arguments

  • min_size: Minimum length of the sequence and quality string.
  • max_size: Maximum length of the sequence and quality string.
  • min_score: Lowest quality (PHRED) score to use.
  • max_score: Highest quality (PHRED) score to use.
  • offset: ASCII encoding offset for quality string.
  • sequence_source: Search strategy to generate the sequence from. By default dna() will be used.
  • identifier_source: Search strategy to generate the sequence identifier from. If None then random text will be generated.
  • additional_description: Whether to add the sequence ID and comment after the + on the third line.
  • wrap_length: Number of characters to wrap the sequence and quality strings on. Set to 0 to disable wrapping.

Note

The default arguments produce quality strings in 'fastq-sanger' format. For 'fastq-illumina', set offset to 64 and max_score to 62. For 'fastq-solexa', set offset to 64, min_score to -5 and max_score to 62. See https://academic.oup.com/nar/article/38/6/1767/3112533 for more details on the FASTQ format and its quality score encoding.
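The layout that the arguments above describe can be sketched with the standard library alone. This is an illustration of the FASTQ entry format under stated assumptions, not the implementation of fastq_entry(): the helper names wrap and format_fastq_entry are hypothetical, and the sketch assumes the text after the + simply repeats the header text.

```python
# Hypothetical helpers for illustration only; not part of this module.

def wrap(text, width):
    """Split text into lines of at most `width` characters.

    A width of 0 disables wrapping and returns the text as a single line.
    """
    if width <= 0 or not text:
        return [text]
    return [text[i:i + width] for i in range(0, len(text), width)]


def format_fastq_entry(identifier, sequence, quality,
                       additional_description=True, wrap_length=80):
    """Assemble one FASTQ entry from its parts."""
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality must have the same length")
    # Assumption: the optional description repeats the header text.
    plus_line = "+" + identifier if additional_description else "+"
    lines = [
        "@" + identifier,
        *wrap(sequence, wrap_length),
        plus_line,
        *wrap(quality, wrap_length),
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_fastq_entry("read1", "ACGTAC", "IIIIII", wrap_length=4))
```

Because the sequence and quality strings share one length and one wrap width, they always span the same number of lines, which is what lets a reader of wrapped FASTQ know where the quality block ends.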

fastq()

@composite
def fastq(draw, entry_source: Optional[SearchStrategy] = None, min_reads: int = 1, max_reads: int = 100) -> str

Generates string representations of FASTQ files.

Arguments

  • entry_source: The search strategy to use for generating FASTQ entries. The default (None) will use fastq_entry with default settings.
  • min_reads: Minimum number of FASTQ entries to generate.
  • max_reads: Maximum number of FASTQ entries to generate.