Take NCBI accession2taxid files, keep only accession and taxa, and save the result as a SQLite database
Usage
read.accession2taxid(
  taxaFiles,
  sqlFile,
  vocal = TRUE,
  extraSqlCommand = "",
  indexTaxa = FALSE,
  overwrite = FALSE
)
Arguments
- taxaFiles
- a string or vector of strings giving the path(s) to files to be read in 
- sqlFile
- a string giving the path where the output SQLite file should be saved 
- vocal
- if TRUE output status messages 
- extraSqlCommand
- for advanced use. A string giving a command to be called on the SQLite database before loading data. One potential use: "pragma temp_store = 2;" to keep all SQLite temp files in memory. Don't do this unless you have a lot (>100 GB) of RAM
- indexTaxa
- if TRUE add an index for taxa ID. This is only necessary if you want to look up accessions by taxa ID, e.g. with getAccessions
- overwrite
- If TRUE, delete accessionTaxa table in database if present and regenerate 
Examples
taxa<-c(
  "accession\taccession.version\ttaxid\tgi",
  "Z17427\tZ17427.1\t3702\t16569",
  "Z17428\tZ17428.1\t3702\t16570",
  "Z17429\tZ17429.1\t3702\t16571",
  "Z17430\tZ17430.1\t3702\t16572"
)
inFile<-tempfile()
sqlFile<-tempfile()
writeLines(taxa,inFile)
read.accession2taxid(inFile,sqlFile,vocal=FALSE)
db<-RSQLite::dbConnect(RSQLite::SQLite(),dbname=sqlFile)
RSQLite::dbGetQuery(db,'SELECT * FROM accessionTaxa')
#>     base accession taxa
#> 1 Z17427  Z17427.1 3702
#> 2 Z17428  Z17428.1 3702
#> 3 Z17429  Z17429.1 3702
#> 4 Z17430  Z17430.1 3702
RSQLite::dbDisconnect(db)
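
The optional arguments can be combined in the same call. The following is a minimal sketch, not part of the original examples. It reuses the toy input from above, passes the "pragma temp_store = 2;" command mentioned under extraSqlCommand, and sets indexTaxa=TRUE so that a lookup by taxa ID can use an index. The query itself is plain SQL against the accessionTaxa table shown above. The variable name hits is only illustrative, and on a file this small the pragma and the index make no practical difference.

```r
library(taxonomizr)

# Toy accession2taxid content in the same format as the example above
taxa <- c(
  "accession\taccession.version\ttaxid\tgi",
  "Z17427\tZ17427.1\t3702\t16569",
  "Z17428\tZ17428.1\t3702\t16570"
)
inFile <- tempfile()
sqlFile <- tempfile()
writeLines(taxa, inFile)

# extraSqlCommand is called on the database before the data are loaded;
# indexTaxa = TRUE additionally indexes the taxa column
read.accession2taxid(
  inFile, sqlFile,
  vocal = FALSE,
  extraSqlCommand = "pragma temp_store = 2;",
  indexTaxa = TRUE
)

# Look up accessions by taxa ID with a plain SQL query
db <- RSQLite::dbConnect(RSQLite::SQLite(), dbname = sqlFile)
hits <- RSQLite::dbGetQuery(
  db, "SELECT accession FROM accessionTaxa WHERE taxa = 3702"
)
RSQLite::dbDisconnect(db)
hits$accession
```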