Releases: ropensci/taxize
taxize v0.9.8
NEW FEATURES
- all `get_*` functions gain some new features (associated new functions are `taxon_last` and `taxon_clear`): a) nicer messages printed to the console when iterating through taxa, and a summary at the end of what was done; and b) state is now saved when running `get_*` functions. That is, in an object external to the `get_*` function call we keep track of what happened, so that if an error is encountered, you can easily restart where you left off; this is especially useful when dealing with a large number of inputs to a `get_*` function. To use this, pass the output of `taxon_last()` to a `get_*` function call. Associated with these changes are new package imports: R6, crayon and cli (#736) (#757)
- gains a new function `taxize_options()` to set options when using taxize. The first reason for the function is to set two options for the above item for `get_*` functions: `taxon_state_messages` to allow taxon state tracking messages in `get_*` functions or not, and `quiet=TRUE` quiets output from the `taxize_options()` function itself
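The restart behavior described above can be pictured with a small base-R sketch. This is not taxize's implementation (the notes only say it relies on R6); `state`, `lookup_one()` and `lookup_all()` are hypothetical names used purely for illustration, and the lookup is a stand-in that needs no network access.

```r
# Hypothetical sketch: keep progress in an object that lives outside the
# function call, so an interrupted run can resume where it stopped.
state <- new.env()
state$done <- list()

# Stand-in for a remote lookup; fails on one specific name when asked to.
lookup_one <- function(name, fail_on = NULL) {
  if (!is.null(fail_on) && identical(name, fail_on)) {
    stop("simulated remote failure for ", name)
  }
  nchar(name)  # placeholder "ID", not a real taxon identifier
}

lookup_all <- function(names, state, fail_on = NULL) {
  for (nm in names) {
    if (!is.null(state$done[[nm]])) next  # already processed, skip it
    state$done[[nm]] <- lookup_one(nm, fail_on)
  }
  unlist(state$done[names])
}

spp <- c("Poa annua", "Pinus contorta", "Quercus robur")

# First run errors part-way through, but finished work is kept in `state`.
first <- try(lookup_all(spp, state, fail_on = "Pinus contorta"), silent = TRUE)

# Second run reuses `state` and only processes what is left.
ids <- lookup_all(spp, state)
```

In taxize itself the equivalent step is passing the output of `taxon_last()` to a `get_*` call, as noted above; the sketch only shows why keeping state outside the call makes that possible.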
MINOR IMPROVEMENTS
- in `id2name()` and `worms_downstream()` use `worrms::wm_record` instead of `worrms::wm_record_` for newest version of `worrms` (#760)
- many `get_*` functions and `col_downstream()`: parameter `verbose` changed to `messages` to not conflict with a `verbose` curl options parameter passed in to `crul`
BUG FIXES
- fix to HTTP request processing for COL: the service sometimes errors and gives a message in the response body, but DOES NOT give the appropriate error HTTP status code, so COL responses always need to be checked (#755) (#756) thanks @dougwyu
- fix to `gbif_downstream()`: GBIF in some cases returns a rank of "unranked", which we hadn't accounted for in internal rank processing code (#758) thanks @ocstringham
taxize v0.9.7
taxize v0.9.6
NEW FEATURES
- gains new functions for Kew's Plants of the World: `get_pow()`, `get_pow_()`, `as.pow()`, `classification.pow()`, `pow_search()`, and `pow_lookup()` (#598) (#739)
- we now pass a user agent string in all HTTP requests to the various data sources so they know it's coming from `taxize`. The string will look something like `r-curl/3.3 crul/0.7.0 rOpenSci(taxize/0.9.6)`, including the versions of the `curl` R pkg, the `crul` package, and the `taxize` package (#662)
- change to `get_colid` functionality: we weren't paginating for the user when there were more than 50 results for a query; we now paginate for the user using async HTTP requests; this means that some requests will take longer than they did before if they have more than 50 results; this is a good change given that you get all the results for your query now (#743)
- change across most `get_*` functions: in some of the `get_*` functions we tried for a direct match (e.g., `"Poa" == "Poa"`) and if one was found, then we were done and returned that record. However, we didn't deploy the same logic across all `get_*` functions. Now all `get_*` functions check for a direct match. Of course if there is a direct match with more than 1 result, you still get the prompt asking you which name you want. (#631) (#734)
MINOR IMPROVEMENTS
- Make separate `taxize-authentication` manual file covering authentication information across the package (#681)
- new case study vignette added (#544) (#721) thanks @fozy81
- add note to `gnr_resolve()` docs about age of datasets used in the Global Names Resolver, and how to access age of datasets (#737)
- `get_eolid()` fixes: gains new attribute `pageid`; `uri`'s given are updated to EOL's new URL format; `rank` and `datasource` parameters were not documented, now are; we no longer use short names for data sources within EOL, but instead use their full names (#702) (#742)
- `col_search()` now returns attributes on the output data.frame's with number of results found and returned, and other metadata about the search
- `gnr_datasources()` loses the `todf` parameter; now always returns a data.frame and the data.frame has all the columns, whereas the default call returned a limited set of columns in previous versions
BUG FIXES
- fix bug in `get_wormsid()`, was failing when there was a direct match found with more than 1 result (#740)
- fix across all `get_*` functions: linting of the input to the `rows` parameter was failing with a vector of values in some cases (#741)
- fix to `iucn_summary()`; we weren't passing on the API key internally correctly (#735) thanks @PrincessPi314 for the report
taxize v0.9.5
DEFUNCT
- `iucn_summary_id()` is defunct, use `iucn_summary()` instead
NEW FEATURES
- `col_downstream()` gains parameter `extant_only` (logical) to optionally keep extant taxa only (#714) thanks @ArielGreiner for the inquiry
- `downstream()` gains another `db` option: Worms. You can now set `db="worms"` to use Worms to get taxa downstream from a target taxon. In addition, `taxize` gains new function `worms_downstream()`, which is used under the hood in `downstream(..., db="worms")` (#713) (#715)
- gains new function `id2name()` with `db` options for tol, itis, ncbi, worms, gbif, col, and bold. The function converts taxonomic IDs to names. It's sort of the inverse of the `get_*()` family of functions. (#712) (#716)
- `tax_rank()` gains new parameter `rows` so that one can pass `rows` down to `get_*()` functions
MINOR IMPROVEMENTS
- `synonyms()` warning from an internal `cbind()` call now fixed (#704) (#705) thanks @vijaybarve
- namespace `taxize` function calls thrown when notifying users about API keys (e.g., `taxize::use_tropicos()`) to make it very clear where the functions live (to avoid confusion with `usethis`) (#724) (#725) thanks @maelle
- changed `iucn_summary()` to output the same structure when no match is found as when a match is found so that when output is passed to `iucn_status()` behavior is the same (#708) thanks @Rekyt
- skip `tax_name()` tests on CRAN (#728)
- `httr` replaced by `crul` throughout (#590)
- most unit tests that make HTTP requests now cached with `vcr`, making tests much faster and not prone to errors due to remote services being down (#729)
- EOL: The EOL API underwent major changes, and we've attempted to get things in working order. `eol_dataobjects()` gains new parameter `language`. `eol_pages()` loses `iucn`, `images`, `videos`, `sounds`, `maps`, and `text` parameters, and gains `images_per_page`, `videos_per_page`, `sounds_per_page`, `maps_per_page`, `texts_per_page`, and `texts_page`. Please do let us know if you find any problems with any EOL functions (#717) (#718)
- As part of EOL changes, the default `db` value for `comm2sci()` and `sci2comm()` is now `ncbi` instead of `eol`
- EUBON base URL now https instead of http
- A number of `get_*()` functions changed parameter `verbose` to `messages` to not conflict with `verbose` passed down to `crul::HttpClient`
- ping functions: `ncbi_ping()` reworked to allow use of your API key as a parameter or pulled from your environment; `eol_ping()` using https instead of http, and parsing JSON instead of XML.
BUG FIXES
- `get_eolid()` was erroring when no results found for a query due to not assigning an internal variable (#701) (#709) thanks for the fix @taddallas
- `get_tolid()` was erroring when values were `NULL` - now replacing all `NULL` with `NA_character_` to make `data.table::rbindlist()` happy (#710) (#711) thanks @gpli for the fix
- add additional rows to the `rank_ref` data.frame of taxonomic ranks: species subgroup, forma, varietas, clade, megacohort, supercohort, cohort, subcohort, infracohort. When there's no matched rank, errors can result in many of the downstream functions. The data.frame now has 43 rows. (#720) (#727)
- fix to `downstream()` and `ncbi_get_taxon_summary()`: change in `ncbi_get_taxon_summary` to break up queries into smaller chunks to avoid HTTP 414 errors ("URI too long") (#727) (#730) thanks for reporting @fischhoff and @benjaminschwetz
- a number of fixes internally (not user facing) to comply with upcoming R-devel changes for checking length greater than 1 in logical statements (#731)
v0.9.4
NEW FEATURES
- new contributor: Gaopeng Li
- gains new functions for helping the user get authentication keys/tokens: `use_entrez()`, `use_eol()`, `use_iucn()` (which uses internally `rredlist::rl_use_iucn()`), and `use_tropicos()` (#682) (#691) (#693) By @maelle
MINOR IMPROVEMENTS
- remove commented out code
BUG FIXES
- fix `tropicos_ping()`
- fixed `downstream()` and `gbif_downstream()`: some of the results don't have a `canonicalName`, so now safely try to get that field (#673)
- fixed `as.uid()`, was erroring when passing in a taxon ID (#674) (#675) by @zachary-foster
- fix in `get_boldid()` (and by extension `classification(..., db = "bold")`): was failing when no parent taxon found, just fill in with NA now (#680)
- fix to `synonyms()`: was failing for some TSNs for `db="itis"` (#685)
- fix to `tax_name()`: `rows` arg wasn't being passed on internally (#686)
- fix to `gnr_resolve()` and `gnr_datasources()`: problems were caused by http scheme, switched to use https instead of http (#687)
- fix to `class2tree()`: organisms with unique rank lower than non-unique ranks will give extra wrong rows (#689) (#690) thanks @gpli
- fix in `ncbi_get_taxon_summary()`: changes in the NCBI API most likely lead to HTTP 414 (URI Too Long) errors. We now loop internally for the user. By extension this helps problems upstream in `downstream()`/`ncbi_downstream()`/`ncbi_children()` (#698)
- fix in `class2tree()`: was erroring when name strings contained pound signs (e.g., `#`) (#699) (#700) thanks @gpli
taxize v0.9.3
MINOR IMPROVEMENTS
- package gains three new authors: Bastian Greshake Tzovaras, Philippe Marchand, and Vinh Tran
- Don't enforce rate limiting via `Sys.sleep` for NCBI requests if the user has an API key (#667)
- Fix to all functions that do NCBI requests to work whether or not a user has an NCBI API key (#668)
- Increased documentation on authentication, see `?taxize-authentication`
- Further conversion of `verbose` to `messages` across the package so that suppressing calls to `message()` does not conflict with curl options passed in
- Converted `genbank2uid()` and `ncbi_get_taxon_summary()` to use `crul` instead of `httr` for HTTP requests
BUG FIXES
- Fix to `get_tolid()`: it was missing assignment of the `att` attribute internally, causing failures in some cases (#663) (#672)
- Fix to `ncbi_children()` (and thus `children()` when requesting NCBI data) to not fail when there is an empty result from the internal call to `classification()` (#664) thanks @arendsee
taxize v0.9.2
Installation
Stalled on CRAN. Install like
install.packages("taxize", repos = c("http://packages.ropensci.org"))
OR
remotes::install_github("ropensci/taxize")
# OR
devtools::install_github("ropensci/taxize")
NEWS
NEW FEATURES
- `class2tree()` gets a major overhaul thanks to @gedankenstuecke and @trvinh (!!). The function now takes unnamed ranks into account when clustering, which fixes a problem where trees were unresolved for many splits as the named taxonomy levels were shared between them. Now it makes full use of the NCBI Taxonomy string, including the unnamed ranks, leading to higher resolution trees that have fewer multifurcations (#611) (#634)
- Added support throughout package for use of NCBI Entrez API keys - NCBI now strongly encourages their use and you get a higher rate limit when you use one. See `?taxize-authentication` for help. Importantly, note that API key names (both R options and environment variables) have changed. They are now the same for R options and env vars: TROPICOS_KEY, EOL_KEY, PLANTMINER_KEY, ENTREZ_KEY. You no longer need an API key for Plantminer. (#640) (#646)
- New author Zebulun Arendsee (@arendsee)
- New package dependencies: `crul` and `zoo`
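Since the key names above are now shared between R options and environment variables, a lookup can use a single name for both. The sketch below is an assumption about how such a lookup might work, not the package's code: `get_key()` is a hypothetical helper, the order (option first, then environment variable) is a guess, and the key value is a placeholder.

```r
# Hypothetical helper: look for a key under the same name as an R option
# first, then as an environment variable; fail clearly if neither is set.
get_key <- function(name) {
  key <- getOption(name, default = Sys.getenv(name, unset = ""))
  if (!nzchar(key)) stop("no API key found for ", name)
  key
}

# Placeholder value for illustration; a real key would normally be set in
# .Renviron or .Rprofile rather than in a script.
Sys.setenv(ENTREZ_KEY = "not-a-real-key")
entrez_key <- get_key("ENTREZ_KEY")
```

See `?taxize-authentication` for how the package itself expects keys to be supplied.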
MINOR IMPROVEMENTS
- In `downstream()` we now pass on `limit` and `start` parameters to `gbif_downstream()`; we weren't doing that before; the two parameters control pagination (#638)
- `genbank2uid()` now returns the correct ID when there are multiple possibilities and invalid IDs no longer make whole batches fail (#642) thanks @zachary-foster
- `children()` outputs made more consistent for certain cases when no results found for searches (#648) (#649) thanks @arendsee
- Improve `downstream()` by passing `...` (additional parameters) down to `ncbi_children()` used internally. This allows, e.g., use of the `ambiguous` parameter in `ncbi_children()`, which lets you remove ambiguously named nodes (#653) (#654) thanks @arendsee
- swapped out use of `httr` for `crul` in EOL and Tropicos functions - note that this won't affect you unless you're passing curl options. See package `crul` for help on curl options. Along with this change, the parameter `verbose` has changed to `messages` (for toggling printing of information messages)
DOCUMENTATION
- Added additional text to the `CONTRIBUTING.md` file for how to contribute to the test suite (#635)
BUG FIXES
- `genbank2uid` now returns the correct ID when there are multiple possibilities and invalid IDs no longer make whole batches fail.
- Fix to `downstream()`: passing numeric taxon ids to the function while using `db="ncbi"` wasn't working (#641) thanks @arendsee
- Fix to `children()`: passing numeric taxon ids to the function while using `db="worms"` wasn't working (#650) (#651) thanks @arendsee
- `synonyms_df()` - that attempts to combine many outputs from the `synonyms()` function - now removes NA/NULL/empty outputs before attempting the combination (#636)
- Fix to `gnr_resolve()`: before, if `preferred_data_sources` was used, you would get the preferred data but only a few columns of the response. We now return all fields; however, we only return the preferred data part when that parameter is used (#656)
- Fixes to `children()`. It was returning unexpected results for ambiguous taxonomic names (e.g., there are some insects that are returned when searching within Bacteria). It was also failing when one tried to get the children of a root taxon (e.g., the children of the NCBI id 131567). (#639) (#647) fixed via PR (#659) thanks @arendsee and @zachary-foster
taxize v0.9.0
Changes to `get_*()` functions
- Added separate documentation file for all get* functions describing attributes and various exception behaviors
- Some `get*()` functions had `NaN` as default `rows` parameter value. Those all changed to `NA`
- Better failure behavior now when non-acceptable `rows` parameter value given
- Added in all type checks for parameters across `get_*()` functions
- Changed behavior across all `get_*()` functions to behave the same when `ask = FALSE, rows = 1` and `ask = TRUE, rows = 1` as these should result in the same outcome. (#627) thanks @zachary-foster !
- Fixed direct match behavior so that when there are multiple results from the data provider, but no direct match, the functions don't give back just `NA` with no indication that there were multiple matches.
- Please let me know if any of these changes cause problems for your code or package.
NEW FEATURES
- Change `comm2sci()` to S3 setup with methods for `character`, `uid`, and `tsn` (#621)
- `iucn_status()` now has S3 setup with a single method that only handles output from the `iucn_summary()` function.
MINOR IMPROVEMENTS
- Add required `key` parameter to fxn `iucn_id()` (#633)
- improve docs for `sci2comm()`: to indicate how to get non-simplified output (which includes what language the common name is from) vs. getting simplified output (#623) thanks @glaroc !
- Fix to `sci2comm()` to not be case sensitive when looking for matches (#625) thanks @glaroc !
- Two additional columns now returned with `eol_search()`: `link` and `content`
- Improve docs in `eol_search()` to describe returned `data.frame`
- Fix `bold_ping()` to use new base URL for their API
- Improved description of the dataset `rank_ref`, see `?rank_ref`
BUG FIXES
- Fix to `downstream()` via fix to `rank_ref` dataset to include "infraspecies" and make "unspecified" and "no rank" equivalent. Fix to `col_downstream()` to properly remove ranks lower than allowed. (#620) thanks @cdeterman !
- `iucn_summary`: changed to using `rredlist` package internally. `sciname` param changed to `x`. `iucn_summary_id()` now is deprecated in favor of `iucn_summary()`. `iucn_summary()` now has a S3 setup, with methods for `character` and `iucn` (#622)
- Added "cohort" to `rank_ref` dataset as that rank is sometimes used at NCBI (from bug reported in `ncbi_downstream()`) (#626)
- Fix to `sci2comm()`, add `tryCatch()` to internals to catch failed requests for specific pageid's (#624) thanks @glaroc !
- Fix URL for taxa for NBN taxonomic ids retrieved via `get_nbnid()` (#632)
taxize v0.8.9
taxize v0.8.8
NEW FEATURES
- New function `ncbi_downstream()` and now NCBI is an option in the function `downstream()` (#583) thanks for the push @andzandz11
- New data source: Wiki*, which includes Wikipedia, Wikispecies, and Wikidata - you can choose which you'd like to search. Uses new package `wikitaxa`, with contributions from @ezwelty (#317)
- `scrapenames()` gains a parameter `return_content`, a boolean, to optionally return the OCR content as a text string with the results. (#614) thanks @fgabriel1891
- New function `get_iucn()` - to get IUCN Red List ids for taxa. In addition, new S3 methods `synonyms.iucn` and `sci2comm.iucn` - no other methods could be made to work with IUCN Red List ids as they do not share their taxonomic classification data (#578) thanks @diogoprov
MINOR IMPROVEMENTS
- `bold` now an option in `classification()` function (#588)
- fix to NBN to use new base URL (#582) (#597)
- `genbank2uid()` can give back more than 1 taxon matched to a given Genbank accession number. Now the function can return more than one match for each query, e.g., try `genbank2uid(id = "AM420293")` (#602) thanks @sariya
- had to modify `cbind()` usage to include `...` for method consistency (#612)
- `tax_rank()` used to be able to do only ncbi and itis. Can now do a lot more data sources: ncbi, itis, eol, col, tropicos, gbif, nbn, worms, natserv, bold (#587)
- Added to `classification()` docs in a section `Lots of results` a note about how to deal with results when there are A LOT of them. (#596) thanks @ahhurlbert for raising the issue
- `tnrs()` now returns the resulting data.frame in the order of the names passed in by the user (#613) thanks @wpetry
- Changes to `gnr_resolve()` to now strip out taxonomic names submitted by user that are NA, or zero length strings, or are not of class character (#606)
- Added description of the columns of the data.frame output in `gnr_resolve()` (#610) thanks @kamapu
- Added note in `tnrs()` docs that the service doesn't provide any information about homonyms. (#610) thanks @kamapu
- Added `parvorder` to the `taxize` `rank_ref` dataset - used by NCBI - if taxa were returned with that rank, some functions in `taxize` were failing due to that rank missing in our reference dataset `rank_ref` (#615)
BUG FIXES
- Fix to `get_colid()` via problem in parsing within `col_search()` (#585)
- Fix to `gbif_downstream` (and thus fix in `downstream()`): there were two rows with form in our `rank_ref` reference dataset of rank names, causing > 1 result in some cases, then causing `vapply` to fail as it's expecting length 1 result (#599) thanks @andzandz11
- Fix `genbank2uid()`: was failing when getting more than 1 result back, works now (#603) and fails better now, giving back warnings/error messages that are more informative (see also #602) thanks @sariya
- Fix to `synonyms.tsn()`: in some cases a TSN has > 1 accepted name. We get accepted names first from the TSN, then look for synonyms, and hadn't accounted for > 1 accepted name. Fixed now (#607) thanks @tdjames
- Fixed bug in `sci2comm()` - was not dealing internally with passing the `simplify` parameter (#616)