Lutzomyia longipalpis (LlonJ1)

Lutzomyia longipalpis Assembly and Gene Annotation

The Lutzomyia longipalpis data and its display on Ensembl Genomes are made possible through a joint effort by the Ensembl Genomes group and VectorBase, a component of VEuPathDB.

About Lutzomyia longipalpis

The sand fly Lutzomyia longipalpis is distributed from Mexico to Argentina, including all the countries of Central America (except Belize) and most of tropical South America east of the Andes (except Guyana, Surinam and French Guiana). Across its range it is the major vector of American visceral leishmaniasis. Studies suggest that L. longipalpis may be a single heterogeneous species or a species complex.

Jacobina strain

Lutzomyia longipalpis (Lutz & Neiva, 1912) (Diptera: Psychodidae) is a New World sand fly. It was the first Lutzomyia species to be recognised as a vector of Leishmania in South and Central America. The female L. longipalpis transmits the protozoan parasite Leishmania infantum (also known as L. chagasi).

The strain of sand flies used for sequencing was originally collected by Prof Richard Ward in 1988 from Jacobina, Bahia State, Brazil. The flies were kept at the Liverpool School of Tropical Medicine before transfer to their present site at Lancaster University in NW England.

Lutzomyia longipalpis is thought to be a species complex, but the number of species within the complex and their relationships are unclear. The sibling species are thought to differ in their vectorial capacity, and genome sequencing will help to identify the most important vector species. This knowledge is vital to understanding the epidemiology and control of this neglected disease in South and Central America.

Source: VectorBase

LlonJ1 assembly

Three types of WGS libraries were used to produce these Lutzomyia longipalpis sequencing data: a 454 Titanium fragment library and paired-end libraries with 3 kb and 8 kb inserts. The 454 data (11.5 million reads; ~24.4x coverage) were derived from a single individual, while the mate pair reads (7.4 million 3 kb reads, 9.6x; 3.7 million 8 kb reads, 4.9x) were derived from a pool of individuals. In total, about 22.6 million reads were generated, representing 38.9x coverage of this sand fly genome.
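The totals quoted above follow directly from the per-library figures. As a minimal sketch, the check below sums the read counts and coverage values given in the text; the library labels and variable names are illustrative and are not part of the original pipeline.

```python
# Per-library figures quoted in the text: (reads in millions, fold coverage).
# The dictionary keys are illustrative labels, not official library names.
libraries = {
    "454 fragment": (11.5, 24.4),
    "3 kb mate pair": (7.4, 9.6),
    "8 kb mate pair": (3.7, 4.9),
}

# Sum each column and round to one decimal place, matching the precision
# used in the text.
total_reads_millions = round(sum(reads for reads, _ in libraries.values()), 1)
total_coverage = round(sum(cov for _, cov in libraries.values()), 1)

print(f"Total reads: {total_reads_millions} million")
print(f"Total coverage: {total_coverage}x")
```

Both sums agree with the stated totals of about 22.6 million reads and 38.9x coverage.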

The Llon_1.0 assembled draft genome sequence was built from the data described above using the Celera CABOG assembler (version 6.1, 2010/03/22). Next, these initial results were used as a backbone for building longer superscaffolds with Baylor's ATLAS-link. Finally, discernible gaps were filled with ATLAS-gapfill and ATLAS-gapmerge. The final assembly includes these superscaffolds, which can be ordered and oriented with respect to each other, and isolated sequences that could not be linked (single contig scaffolds or singletons from the original CABOG assembly).

The total length of all contigs is 142.7 Mb; however, the total span of the assembly is 154.2 Mb once gaps are included. The N50 of the contigs is 7.5 kb and the N50 of the scaffolds is 85.1 kb. Both the assembly and the description of the positions and orientations of contigs (AGP) are available from the Sand fly section of the BCM-HGSC web site.
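For readers unfamiliar with the statistic, N50 is the sequence length at which sequences of that length or longer account for at least half of the total assembled bases. The sketch below shows one common way to compute it and also derives the gap content implied by the contig and scaffold totals above. The function name `n50` and the toy contig lengths are hypothetical; the real contig lengths are not reproduced here.

```python
def n50(lengths):
    """Return the length L such that sequences >= L cover at least half the total."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy lengths, used only to illustrate the calculation.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_contigs)

# Gap content implied by the figures in the text (in Mb).
contig_total_mb = 142.7
assembly_span_mb = 154.2
gap_mb = round(assembly_span_mb - contig_total_mb, 1)
gap_fraction = gap_mb / assembly_span_mb

print(f"Toy N50: {toy_n50}")
print(f"Gaps: {gap_mb} Mb ({gap_fraction:.1%} of the assembly span)")
```

On these figures, gaps account for roughly 11.5 Mb of the assembly span.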

LlonJ1.6 gene set

Community annotation patch build for July 2019.



Assembly: LlonJ1, INSDC Assembly GCA_000265325.1, Jun 2012
Database version: 108.1
Golden Path Length: 154,229,266
Genebuild by: VEuPathDB
Genebuild method: Import
Data source: VectorBase

Gene counts

Coding genes: 10,427
Non coding genes: 338
Small non coding genes: 334
Long non coding genes: 4
Gene transcripts: 10,796


Snap gene prediction: 37,229
Short Variants: 4,821,847