Welcome to Academic Tracker’s Documentation!
Academic Tracker
Academic Tracker was created to automate the process of making sure that federally funded publications get listed on PubMed and that the grant funding source for them is cited.
Academic Tracker is a command line tool to search PubMed, ORCID, Google Scholar, and Crossref for publications. The program can be given either a list of authors whose publications it should find, or references/citations to the publications themselves. It then queries the aforementioned sources and returns whatever relevant information they have available.
The primary use case is to give the program a list of authors to find publications for. From this list of publications it can then be determined which ones need further action to be in compliance.
A secondary use case when searching for an author’s publications is to create a report of the collaborators they have worked with. This is done by specifying the creation of that report in the configuration file. Details on reports are in the documentation.
The other primary use case is to give the program a list of publication references to find information for.
Installation
The Academic Tracker package runs under Python 3.8+. Use pip to install. Starting with Python 3.4, pip is included by default. Be sure to use the latest version of pip as older versions are known to have issues grabbing all dependencies. Academic Tracker relies on sendmail to send emails, so if you need to use that feature be sure sendmail is installed and in PATH.
Install on Linux, Mac OS X
python3 -m pip install academic-tracker
Install on Windows
py -3 -m pip install academic-tracker
Upgrade on Linux, Mac OS X
python3 -m pip install academic-tracker --upgrade
Upgrade on Windows
py -3 -m pip install academic-tracker --upgrade
Quickstart
Academic Tracker has several commands and options. The simplest and most common use case is:
academic_tracker author_search config_file.json
Example config files can be downloaded from the example_configs directory of the GitHub repo.
Academic Tracker’s behavior can be quite complex though, so reading the guide and tutorial is strongly encouraged.
Creating The Configuration JSON
A configuration JSON file is required to run Academic Tracker, but creating it for the first time can be burdensome. Unfortunately, there is no easy solution for this. We encourage you to read the configuration JSON section in jsonschema and use the example there as a starting point. The add_authors command can help with building the Authors section if you already have a csv file with author information. A good tool to help track down pesky JSON syntax errors is here. There are also examples in the example_configs directory of the GitHub repo, and more in the supplemental material of the paper https://doi.org/10.6084/m9.figshare.19412165.
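As a local alternative for tracking down syntax errors, Python's standard json module reports the line and column where parsing failed. The sketch below is illustrative only: the function name find_json_syntax_error is hypothetical and not part of Academic Tracker, and it checks JSON syntax only, not whether the file is a valid configuration.

```python
import json


def find_json_syntax_error(text):
    """Return None if text is valid JSON, else 'line X, column Y: message'."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno, colno, and msg are attributes of json.JSONDecodeError.
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


# Hypothetical fragment with a trailing comma, a common hand-editing mistake.
broken = '{\n  "Authors": {},\n}'
print(find_json_syntax_error(broken))
```

To check a real file, read it first, for example with open("config_file.json", encoding="utf-8") as f: print(find_json_syntax_error(f.read())). The exact message and position reported can vary between Python versions.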
Registering With ORCID
To have this program search ORCID you must register with ORCID and obtain a key and secret. Details on how to do that are here. If you do not want to register, either use the --no_ORCID option to skip searching ORCID or leave the ORCID_search section out of the config file.
Mac OS Note
When you try to run the program on Mac OS you may get an SSL error.
certificate verify failed: unable to get local issuer certificate
This is due to a change in Mac OS and Python. To fix it, go to the /Applications/Python 3.x folder and run the Install Certificates.command shell command found there.
Email Sending Note
Academic Tracker uses sendmail to send emails, so any system it is used on needs to have sendmail installed and on PATH. If you try to send emails without it, the program will display a warning. This can be avoided with the --test option, which blocks email sending. Email sending can also be avoided by leaving the from_email attribute out of the report sections of the configuration JSON file.
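Before a run that should send emails, you can check whether sendmail is visible on PATH. This is a minimal sketch using the standard library; the helper name sendmail_available is hypothetical, and Academic Tracker's own check may work differently.

```python
import shutil


def sendmail_available(program="sendmail"):
    """Return True if the named program can be found on PATH."""
    # shutil.which returns the full path of the executable, or None.
    return shutil.which(program) is not None


if not sendmail_available():
    print("sendmail was not found on PATH; consider running with --test.")
```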
How Publications Are Matched
When searching by publications it is necessary to confirm that the publication in the given reference matches the publication returned by the query. This is done by matching the DOIs, the PMIDs, or the title and at least one author. Titles are fuzzy matched using fuzzywuzzy, which is why at least one author must also match. Authors are matched using last name and at least one affiliation.
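The matching rules above can be sketched roughly as follows. This is not Academic Tracker's actual code: the function names, the dictionary keys (doi, pmid, title, authors, last_name, affiliations), and the 0.9 threshold are all assumptions made for illustration, and the standard library's difflib.SequenceMatcher stands in for fuzzywuzzy so the example has no third-party dependencies.

```python
from difflib import SequenceMatcher


def titles_fuzzy_match(title_a, title_b, threshold=0.9):
    """Compare two titles case-insensitively; the threshold is an assumed cutoff."""
    a = title_a.lower().strip()
    b = title_b.lower().strip()
    return SequenceMatcher(None, a, b).ratio() >= threshold


def authors_match(ref_author, pub_author):
    """An author matches on last name plus at least one shared affiliation."""
    same_last_name = ref_author["last_name"].lower() == pub_author["last_name"].lower()
    ref_affiliations = {a.lower() for a in ref_author["affiliations"]}
    pub_affiliations = {a.lower() for a in pub_author["affiliations"]}
    return same_last_name and bool(ref_affiliations & pub_affiliations)


def is_same_publication(reference, result):
    """Match on DOI, else PMID, else fuzzy title plus at least one author."""
    for key in ("doi", "pmid"):
        if reference.get(key) and reference.get(key) == result.get(key):
            return True
    # A fuzzy title match alone is not trusted, so an author must also match.
    if not titles_fuzzy_match(reference["title"], result["title"]):
        return False
    return any(
        authors_match(ref_author, pub_author)
        for ref_author in reference["authors"]
        for pub_author in result["authors"]
    )
```

Requiring an author match alongside the fuzzy title guards against two different papers with near-identical titles being treated as the same publication.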
Troubleshooting Errors
If you experience errors when running Academic Tracker, the first thing to do is simply try again. Since Academic Tracker communicates with multiple web sources, it is not uncommon for a problem to occur with one of them. If there is a communication issue with a particular source, it may also help to wait several hours or until the next day before trying again. You can also use the corresponding “--no_Source” option for whichever source is causing an error. For example, if Crossref keeps returning 504 HTTP errors you can run with the --no_Crossref option. If the issue persists across multiple runs, try upgrading Academic Tracker’s dependencies with “pip install --upgrade dependency_name”. The list of dependencies is in the guide.
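If you script your runs, skipping a failing source can be expressed as a small command builder. This is a sketch under assumptions: the helper build_author_search_command is hypothetical, and only the --no_ORCID and --no_Crossref flags are named in this page, so check the guide for the exact option names of other sources and for where options may be placed on the command line.

```python
def build_author_search_command(config_path, skip_sources=()):
    """Build an argument list for an author search that skips some sources."""
    command = ["academic_tracker", "author_search", config_path]
    # Each skipped source becomes a --no_<Source> option, e.g. --no_Crossref.
    command += [f"--no_{source}" for source in skip_sources]
    return command


print(build_author_search_command("config_file.json", ["Crossref"]))
# To actually run it: subprocess.run(build_author_search_command(...))
```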
License
This package is distributed under the BSD license.
Documentation index:
- User Guide
- Tutorial
- Configuration JSON File
- Outputs
- Search For Publications By Author
- Search For Publications By Reference
- Find ORCID IDs for Authors
- Find Scholar IDs for Authors
- Add Or Update Authors In Configuration JSON
- Tokenize A Reference
- Generate Reports And Emails Like Author Search
- Generate Reports And Emails Like Reference Search
- JSON Schema
- Reporting
- Tokenization
- API
- User Input Checking
cli_inputs_check()
config_file_check()
config_report_check()
prev_pubs_file_check()
ref_config_file_check()
tok_reference_check()
tracker_validate()
- Author Search Modularized
build_publication_dict()
generate_internal_data_and_check_authors()
input_reading_and_checking()
save_and_send_reports_and_emails()
- Author Search Webio
search_Crossref_for_pubs()
search_Google_Scholar_for_pubs()
search_ORCID_for_pubs()
search_PubMed_for_pubs()
- Author Search Emails and Reports
build_author_loop()
create_collaborator_report()
create_collaborators_reports_and_emails()
create_project_report()
create_project_reports_and_emails()
create_pubs_by_author_dict()
create_summary_report()
create_tabular_collaborator_report()
create_tabular_project_report()
create_tabular_summary_report()
- Reference Search Modularized
build_publication_dict()
input_reading_and_checking()
save_and_send_reports_and_emails()
- Reference Search Webio
build_pub_dict_from_PMID()
parse_myncbi_citations()
search_references_on_source()
tokenize_reference_input()
- Reference Search Emails and Reports
convert_tokenized_authors_to_str()
create_report_from_template()
create_tabular_report()
create_tokenization_report()
- Citation Parsing
parse_MEDLINE_format()
parse_text_for_citations()
tokenize_APA_or_Harvard_authors()
tokenize_MLA_or_Chicago_authors()
tokenize_Vancouver_authors()
tokenize_myncbi_citations()
- Fileio
load_json()
read_csv()
read_previous_publications()
read_text_from_docx()
read_text_from_txt()
save_emails_to_file()
save_json_to_file()
save_publications_to_file()
save_string_to_file()
- Helper Functions
adjust_author_attributes()
are_citations_in_pub_dict()
create_authors_by_project_dict()
create_pub_dict_for_saving_Crossref()
create_pub_dict_for_saving_PubMed()
do_strings_fuzzy_match()
extract_ORCID_from_string()
find_common_subphrases()
find_duplicate_citations()
fuzzy_matches_to_list()
get_pub_id_in_publication_dict()
is_fuzzy_match_to_list()
is_pub_in_publication_dict()
match_authors_in_prev_pub()
match_pub_authors_to_citation_authors()
match_pub_authors_to_config_authors()
normalize_DOI()
regex_group_return()
regex_match_return()
regex_search_return()
vprint()
- Webio
clean_tags_from_url()
get_DOI_from_Crossref()
get_url_contents_as_str()
search_Google_Scholar_for_ids()
search_ORCID_for_ids()
send_emails()
- Emails and Reports Helpers
- License
- TODO List
- Change Log