User Guide
Description
Academic Tracker was created to automate the process of making sure that federally funded publications get listed on PubMed and that the grant funding source for them is cited.
Academic Tracker searches PubMed, ORCID, Crossref, and Google Scholar for publications. The two main use cases let users search by author names or by a publication citation/reference. The output is customizable, but in general it consists of a JSON file of publication information, a JSON file of email information if emails were sent, and text files of summary information.
A secondary use of searching by author names is to create a report of the collaborators each author has worked with. To do this, specify the creation of that report in the configuration file. Details on reports are in the documentation.
Installation
The Academic Tracker package runs under Python 3.7+. Use pip to install. Starting with Python 3.4, pip is included by default.
Install on Linux, Mac OS X
python3 -m pip install academic_tracker
Install on Windows
py -3 -m pip install academic_tracker
Upgrade on Linux, Mac OS X
python3 -m pip install academic_tracker --upgrade
Upgrade on Windows
py -3 -m pip install academic_tracker --upgrade
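The install and upgrade commands above differ only in the interpreter launcher (python3 versus py -3) and the --upgrade flag. For scripted setups, a minimal sketch of choosing the right command per platform might look like the following. The helper name pip_command and its parameters are hypothetical and not part of Academic Tracker; the sketch only builds the argument list and does not run it.

```python
import platform


def pip_command(action="install", system=None):
    """Build the pip argument list shown above for the given platform.

    `action` is "install" or "upgrade"; `system` defaults to the value
    reported by platform.system() (for example "Linux", "Darwin", "Windows").
    This is an illustrative helper, not part of academic_tracker.
    """
    system = system or platform.system()
    # Windows uses the `py -3` launcher; Linux and Mac OS X use `python3`.
    launcher = ["py", "-3"] if system == "Windows" else ["python3"]
    command = launcher + ["-m", "pip", "install", "academic_tracker"]
    if action == "upgrade":
        command.append("--upgrade")
    return command


# Show the command for the current platform without executing it.
print(" ".join(pip_command("install")))
```

The resulting list can be passed to subprocess.run if the command should actually be executed.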
Install inside virtualenv
For an isolated install, you can run the same inside a virtualenv.
$ virtualenv -p /usr/bin/python3 venv # create virtual environment, use python3 interpreter
$ source venv/bin/activate # activate virtual environment
$ python3 -m pip install academic_tracker # install academic_tracker as usual
$ deactivate # if you are done working in the virtual environment
Get the source code
Code is available on GitHub: https://github.com/MoseleyBioinformaticsLab/academic_tracker
You can either clone the public repository:
$ git clone https://github.com/MoseleyBioinformaticsLab/academic_tracker.git
Or, download the tarball and/or zipball:
$ curl -OL https://github.com/MoseleyBioinformaticsLab/academic_tracker/tarball/main
$ curl -OL https://github.com/MoseleyBioinformaticsLab/academic_tracker/zipball/main
Once you have a copy of the source, you can embed it in your own Python package, or install it into your system site-packages easily:
$ python3 setup.py install
Dependencies
The Academic Tracker package depends on several Python libraries. The pip command will install all dependencies automatically, but if you wish to install them manually, run the following commands:
- jsonschema for validating JSON.
To install the jsonschema Python library run the following:
python3 -m pip install jsonschema # On Linux, Mac OS X
py -3 -m pip install jsonschema # On Windows
- beautifulsoup4 for parsing webpages.
To install the beautifulsoup4 Python library run the following:
python3 -m pip install beautifulsoup4 # On Linux, Mac OS X
py -3 -m pip install beautifulsoup4 # On Windows
- fuzzywuzzy for fuzzy matching publication titles.
To install the fuzzywuzzy Python library run the following:
python3 -m pip install fuzzywuzzy # On Linux, Mac OS X
py -3 -m pip install fuzzywuzzy # On Windows
- python-docx for reading docx files.
To install the python-docx Python library run the following:
python3 -m pip install python-docx # On Linux, Mac OS X
py -3 -m pip install python-docx # On Windows
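After a manual install, it can help to confirm that each dependency listed above is importable. The sketch below is one way to check, using only the standard library. Note that two of the packages are imported under a different name than the one used with pip: beautifulsoup4 is imported as bs4 and python-docx as docx. The function name missing_dependencies is hypothetical and not part of Academic Tracker.

```python
import importlib.util

# Name used with pip -> module name used in `import` statements.
IMPORT_NAMES = {
    "jsonschema": "jsonschema",
    "beautifulsoup4": "bs4",
    "fuzzywuzzy": "fuzzywuzzy",
    "python-docx": "docx",
}


def missing_dependencies(import_names=IMPORT_NAMES):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in import_names.items()
        if importlib.util.find_spec(module_name) is None
    ]


# Report the result for the current environment.
missing = missing_dependencies()
if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All listed dependencies are importable.")
```

This only checks that the modules can be located; it does not verify their versions.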
Basic usage
Academic Tracker expects at least a configuration JSON file, and possibly more files depending on the usage. The two main use cases are author_search and reference_search; the other usages are mostly included to support those. author_search searches by the authors given in the configuration JSON file, while reference_search searches by the publication references given in the reference file or URL. Details about the JSON files are in the JSON Schema section, and more information about the use cases, with examples, is in the Tutorial section.
Usage:
academic_tracker author_search <config_json_file> [--test]
[--prev-pub=<file-path> --prev_pub=<file-path>]
[--save-all-queries]
[--no-GoogleScholar --no_GoogleScholar]
[--no-ORCID --no_ORCID]
[--no-Crossref --no_Crossref]
[--no-PubMed --no_PubMed]
[--verbose --silent]
academic_tracker reference_search <config_json_file> <references_file_or_URL> [--test]
[--prev-pub=<file-path> --prev_pub=<file-path>]
[--save-all-queries]
[--PMID-reference --PMID_reference]
[--MEDLINE-reference --MEDLINE_reference]
[--keep-duplicates]
[--no-Crossref --no_Crossref]
[--no-PubMed --no_PubMed]
[--verbose --silent]
academic_tracker find_ORCID <config_json_file> [--verbose --silent]
academic_tracker find_Google_Scholar <config_json_file> [--verbose --silent]
academic_tracker add_authors <config_json_file> <authors_file> [--verbose --silent]
academic_tracker tokenize_reference <references_file_or_URL> [--MEDLINE-reference --MEDLINE_reference]
[--keep-duplicates]
[--verbose --silent]
academic_tracker gen_reports_and_emails_auth <config_json_file> <publication_json_file> [--test --verbose --silent]
academic_tracker gen_reports_and_emails_ref <config_json_file> <references_file_or_URL> <publication_json_file> [--test]
[--prev-pub=<file-path> --prev_pub=<file-path>]
[--MEDLINE-reference --MEDLINE_reference]
[--keep-duplicates]
[--verbose --silent]
Options:
-h --help Show this screen.
-v --version Show version.
--verbose Print hidden error messages.
--silent Do not print anything to the screen.
--test Generate pubs and email texts, but do not send emails.
--prev-pub=<file-path> Filepath to a JSON or CSV file with publication ids to ignore.
Enter "ignore" for the <file-path> to not look for previous publications.json files in tracker directories.
--prev_pub=<file-path> Deprecated. Use --prev-pub instead.
--save-all-queries Save all queried results from each source in "all_results.json".
--keep-duplicates By default, duplicate entries are removed after references are tokenized. Use this option to keep them.
Reference Type Options:
--PMID-reference Indicates that the references file is a PMID file; only PubMed information will be returned.
--PMID_reference Deprecated. Use --PMID-reference instead.
--MEDLINE-reference Indicates that the references file is a MEDLINE file.
--MEDLINE_reference Deprecated. Use --MEDLINE-reference instead.
Search Options:
--no-GoogleScholar Don't search Google Scholar.
--no_GoogleScholar Deprecated. Use --no-GoogleScholar instead.
--no-ORCID Don't search ORCID.
--no_ORCID Deprecated. Use --no-ORCID instead.
--no-Crossref Don't search Crossref.
--no_Crossref Deprecated. Use --no-Crossref instead.
--no-PubMed Don't search PubMed.
--no_PubMed Deprecated. Use --no-PubMed instead.
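Several options above have a deprecated underscore spelling alongside the current hyphenated one. For existing scripts that still pass the old spellings, a minimal sketch of rewriting an argument list to the current forms is shown below. The mapping is taken directly from the option list above; the function name modernize_flags is hypothetical and not part of Academic Tracker.

```python
# Deprecated option -> current option, as listed in the options above.
DEPRECATED_TO_CURRENT = {
    "--prev_pub": "--prev-pub",
    "--PMID_reference": "--PMID-reference",
    "--MEDLINE_reference": "--MEDLINE-reference",
    "--no_GoogleScholar": "--no-GoogleScholar",
    "--no_ORCID": "--no-ORCID",
    "--no_Crossref": "--no-Crossref",
    "--no_PubMed": "--no-PubMed",
}


def modernize_flags(argv):
    """Return a copy of argv with deprecated option names replaced.

    Options given as --name=value keep their value; anything that is
    not a deprecated option name is passed through unchanged.
    """
    updated = []
    for arg in argv:
        name, separator, value = arg.partition("=")
        updated.append(DEPRECATED_TO_CURRENT.get(name, name) + separator + value)
    return updated


old_args = ["author_search", "config.json", "--prev_pub=ignore", "--no_ORCID", "--test"]
print(modernize_flags(old_args))
```

The positional arguments and current options pass through untouched, so the function is safe to apply to argument lists that have already been updated.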