Parsing; UniProt 2.0 | ||
| This measures only parsing and generating the INSERT queries, with some consistency checking. | ||
| SwissProt | TrEMBL | |
| Size of the flat-file: | 575M | 2379M |
| Parsing and preparing for inserts: | 10 min | 52 min |
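
To make the measured step concrete, here is a minimal sketch of what "parsing and preparing for inserts" could look like. It is not the ModBioSQL loader: the `entry` table, its column names, and the sample record (including the accession `P00000`) are hypothetical, and the only consistency check shown is that the sequence length matches the length declared on the `ID` line.

```python
INSERT_SQL = "INSERT INTO entry (accession, name, seq) VALUES (%s, %s, %s)"  # hypothetical table


def parse_entries(lines):
    """Yield (entry_name, accession, declared_length, sequence) for each record."""
    name = acc = declared = None
    seq_parts = []
    in_seq = False
    for line in lines:
        code = line[:2]
        if code == "ID":
            fields = line[5:].split()
            name = fields[0]
            declared = int(fields[-2])  # the ID line ends with "<length> AA."
        elif code == "AC" and acc is None:
            acc = line[5:].split(";")[0].strip()  # keep the primary accession only
        elif code == "SQ":
            in_seq = True  # sequence data follows on indented lines
        elif code == "//":
            yield name, acc, declared, "".join(seq_parts)
            name = acc = declared = None
            seq_parts = []
            in_seq = False
        elif in_seq:
            seq_parts.append(line.replace(" ", "").strip())


def build_insert(record):
    """Return (sql, params) for one record after a simple consistency check."""
    name, acc, declared, seq = record
    if len(seq) != declared:
        raise ValueError(f"{acc}: declared {declared} residues, found {len(seq)}")
    return INSERT_SQL, (acc, name, seq)


# Synthetic record for illustration only.
SAMPLE = """\
ID   TEST_DEMO      Reviewed;      10 AA.
AC   P00000;
SQ   SEQUENCE   10 AA;
     MGDVEKGKKI
//
"""

statements = [build_insert(r) for r in parse_entries(SAMPLE.splitlines())]
print(statements)
```

Returning the SQL and its parameters separately leaves quoting to the database driver, so the same prepared tuples could be handed to `cursor.execute` in any driver that accepts `%s` placeholders.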
Parsing and loading; UniProt 2.0 | ||
| flat-file: | N/A (0 min) | N/A (0 min) |
| ModBioSQL; MySQLdb: | 22 min | 99 min |
| ModBioSQL; PyPgSQL: | 22 min | 92 min |
Creating indexes; UniProt 2.0 | ||
| flat-file: | 27 min | |
| ModBioSQL; MySQLdb: | 46 min | |
| ModBioSQL; PyPgSQL: | 31 min | |
Creating constraints; UniProt 2.0 | ||
| ModBioSQL; MySQLdb: | 100 min | |
| ModBioSQL; PyPgSQL: | 68 min | |
Total time | ||
| BioSQL; SwissProt; MySQLdb: | 7h 40 min | |
| BioSQL; SwissProt; psycopg: | 7h 23 min | |
| ModBioSQL; SwissProt; MySQLdb: | 48 min | |
| ModBioSQL; SwissProt; psycopg: | 41 min | |
| ModBioSQL; UniProt; MySQLdb: | 4h 24 min | |
| ModBioSQL; UniProt; PyPgSQL: | 3h 33 min | |
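
The totals appear to be the sum of the per-phase timings. The short sketch below checks this for the ModBioSQL / UniProt / PyPgSQL run, assuming the total covers loading SwissProt and TrEMBL, creating indexes, and creating constraints; the dictionary keys are labels chosen for this example.

```python
# Phase timings in minutes, copied from the tables above
# (ModBioSQL; UniProt; PyPgSQL).
phases_min = {
    "parse_and_load_swissprot": 22,
    "parse_and_load_trembl": 92,
    "create_indexes": 31,
    "create_constraints": 68,
}


def format_minutes(total_min):
    """Render a duration in minutes as 'Xh Y min'."""
    hours, minutes = divmod(total_min, 60)
    return f"{hours}h {minutes} min"


total_min = sum(phases_min.values())
print(format_minutes(total_min))  # 3h 33 min
```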
Size on disk (approx.) | ||
| flat-file; UniProt: | 3,225M (dat files) / 766M (indexes) / 3,991M (total) | |
| BioSQL; SwissProt; +indexes; PgSQL: | 1.12G | |
| ModBioSQL; SwissProt; +indexes; PgSQL: | 706M | |
| ModBioSQL; UniProt; PgSQL: | 3,041M (data) / 780M (indexes) / 3,812M (total) | |
Reading (100,000) random sequences | |
| flat-file: | 1,111,1111 seqs/s |
| BioSQL; SwissProt; psycopg: | 7,142 seqs/s |
| ModBioSQL; SwissProt; psycopg: | 14,556 seqs/s |
| ModBioSQL; SwissProt; PyPgSQL: | 3,600 seqs/s |
| ModBioSQL; UniProt; MySQLdb: | 5,321 seqs/s |
| ModBioSQL; UniProt; PyPgSQL: | 2,832 seqs/s |
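
A random-read rate like the ones above can be measured along the following lines. This is a sketch under stated assumptions, not the benchmark that produced these numbers: it uses an in-memory SQLite database from the Python standard library as a stand-in for MySQL or PostgreSQL, and the `sequences` table and its dummy contents are hypothetical.

```python
import random
import sqlite3
import time


def build_demo_db(n_rows):
    """Create an in-memory table of dummy sequences (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sequences (id INTEGER PRIMARY KEY, seq TEXT)")
    conn.executemany(
        "INSERT INTO sequences (id, seq) VALUES (?, ?)",
        ((i, "MKT" * 10) for i in range(n_rows)),
    )
    conn.commit()
    return conn


def random_read_rate(conn, n_rows, n_reads, seed=0):
    """Fetch n_reads sequences by random id; return (rows fetched, seqs per second)."""
    rng = random.Random(seed)
    # Pick the ids before starting the clock so only the reads are timed.
    ids = [rng.randrange(n_rows) for _ in range(n_reads)]
    start = time.perf_counter()
    fetched = 0
    for seq_id in ids:
        row = conn.execute(
            "SELECT seq FROM sequences WHERE id = ?", (seq_id,)
        ).fetchone()
        if row is not None:
            fetched += 1
    elapsed = time.perf_counter() - start
    return fetched, fetched / elapsed


conn = build_demo_db(10_000)
fetched, rate = random_read_rate(conn, 10_000, 1_000)
print(f"{fetched} sequences read, {rate:,.0f} seqs/s")
```

Issuing one query per sequence, as here, includes the per-query driver overhead in the measurement, which is likely part of why different drivers against the same database report different rates.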