Find a taxon by name in the NCBI taxonomy. Note that NCBI species are stored as "Genus species", e.g. "Bos taurus". Ambiguous taxon names return a comma-concatenated string, e.g. "123,234", and generate a warning.
Value
a vector of character strings giving taxa IDs (comma-concatenated for any name that matches more than one taxon, NA for any name with no match)
Examples
namesText<-c(
"1\t|\tall\t|\t\t|\tsynonym\t|",
"1\t|\troot\t|\t\t|\tscientific name\t|",
"3\t|\tMulti\t|\tBacteria <prokaryotes>\t|\tscientific name\t|",
"4\t|\tMulti\t|\tBacteria <prokaryotes>\t|\tscientific name\t|",
"2\t|\tBacteria\t|\tBacteria <prokaryotes>\t|\tscientific name\t|",
"2\t|\tMonera\t|\tMonera <Bacteria>\t|\tin-part\t|",
"2\t|\tProcaryotae\t|\tProcaryotae <Bacteria>\t|\tin-part\t|"
)
tmpFile<-tempfile()
writeLines(namesText,tmpFile)
sqlFile<-tempfile()
read.names.sql(tmpFile,sqlFile)
getId('Bacteria',sqlFile)
#> [1] "2"
getId('Not a real name',sqlFile)
#> [1] NA
getId('Multi',sqlFile)
#> Warning: Multiple taxa ids found for Multi. Collapsing with commas
#> [1] "3,4"
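# A minimal sketch, not part of the package: an ambiguous name comes back as
# a single comma-concatenated string, so base R's strsplit() can separate it
# into individual ids. The variable names below are illustrative only.
ambiguousIds<-"3,4"
splitIds<-strsplit(ambiguousIds,",")[[1]]
splitIds
#> [1] "3" "4"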